Microsoft Azure Cognitive Services – Part I

Artificial intelligence has become the latest tech buzzword. But what is "artificial intelligence"?

It is a broad term for any computer software that engages in human-like activities, including learning, planning, and problem-solving. Artificial intelligence is already widely used across industries in business applications such as automation, data analytics, and natural language processing.

Enterprise use of AI grew 270% over the past 4 years according to a report by Gartner. Are you planning to implement AI in your organization as well?

Microsoft has invested heavily in this area for almost 18 years, offering APIs, SDKs, and services such as Azure Cognitive Services to help developers. With Cognitive Services, developers can build AI applications by adding cognitive features to their apps, such as seeing, hearing, speaking, understanding, and even beginning to reason.

Azure Cognitive Services are not only versatile and compatible with almost all Azure resources, but they also allow developers to perform complex deployments that comply with business requirements. Cognitive Services has been awarded certifications such as CSA STAR Certification, FedRAMP Moderate, and HIPAA BAA.

There are Azure Cognitive Services designed to process images and return information. These services are useful in scenarios such as content moderation, text recognition, health care, etc.

The following list shows the Azure Vision APIs; some of them are described below:

  • Computer Vision
  • Custom Vision Service
  • Face
  • Form Recognizer
  • Ink Recognizer
  • Video Indexer

Computer Vision API

The Computer Vision API analyzes images and returns information about them. It gives developers access to advanced image-processing algorithms that return insights about an image's visual features and characteristics. You can use Computer Vision in your application through a native SDK or by invoking the REST API directly. The Computer Vision Read API extracts printed and handwritten text from images into a machine-readable character stream, and the optical character recognition (OCR) API extracts printed text in several languages.
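As a rough illustration of invoking the REST API directly, the sketch below builds (but does not send) an image analysis request using only the Python standard library. The `v3.2` path, the `visualFeatures` parameter, and the `Ocp-Apim-Subscription-Key` header follow the public Computer Vision REST reference as we understand it; the function name, the endpoint, and the key are placeholders of our own, so treat this as an assumption-laden sketch rather than an official sample.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint, key, image_url, features=("Description", "Tags")):
    """Build (but do not send) a Computer Vision 'analyze' request.

    endpoint and key are placeholders for the values of your own Azure resource.
    """
    # The visual features to return are passed as a comma-separated query parameter.
    query = urllib.parse.urlencode({"visualFeatures": ",".join(features)})
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"
    # The image is referenced by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # API key authentication
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To actually call the service, you would pass the returned object to `urllib.request.urlopen` and parse the JSON response. Building the request separately from sending it keeps the example testable without network access or a real key.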

Video Indexer API

This API enables you to extract insights from videos. It can create audio transcriptions, translate audio into other languages, generate search metadata from videos, and much more. On large video archives, this service enables deep search, reduces operational costs, opens new monetization opportunities, and creates new user experiences. Additionally, it can apply textual and visual content moderation to keep users safe from inappropriate content.

Nowadays, we interact with virtual assistants such as Microsoft Cortana, Amazon Alexa, Google Assistant, and Aura by Telefonica, to name just a few. All of these services have something in common: they use speech cognitive services to recognize us when we speak to them. The following list shows the Azure Speech APIs; some of them are described below:

  • Speech service
  • Speaker Recognition API
  • Bing Speech
  • Translator Speech

Speaker Recognition API

The Microsoft Azure Cognitive Services Speaker Recognition API identifies and authenticates individual users by their voice, which has unique characteristics that can be associated with an individual. It can be used to add a layer of security to third-party applications through speech verification. Speaker Recognition is divided into two categories: speaker verification and speaker identification. The Speaker Recognition API uses JSON for data exchange and API keys for authentication.

Speech Services API

The Speech service unifies speech-to-text, text-to-speech, and speech translation. The Speech Services API enables real-time transcription of audio streams into text. It allows your applications, tools, or devices to consume, display, and act upon the commands sent to the speech-to-text service. This service is powered by the same recognition technology that is used for Cortana. It provides a way to capture audio from a microphone, read from a stream, and more. In addition to the standard Speech service model, you can create custom models. Customization helps to overcome speech recognition barriers such as speaking style, vocabulary, and background noise.
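For a feel of what a speech-to-text call can look like, here is a minimal sketch that prepares a request for the Speech service's REST API for short audio. The host pattern, path, and headers are taken from the public REST documentation as we understand it and are not part of the description above; the function name, the region, and the key are illustrative placeholders. Real-time streaming scenarios would normally go through the Speech SDK instead.

```python
import urllib.parse
import urllib.request


def build_stt_request(region, key, wav_bytes, language="en-US"):
    """Build (but do not send) a speech-to-text request for a short WAV clip.

    region and key are placeholders for the values of your own Speech resource.
    """
    query = urllib.parse.urlencode({"language": language, "format": "simple"})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # API key authentication
        # Assumes 16 kHz PCM audio in a WAV container.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=wav_bytes, headers=headers, method="POST")
```

Sending the request with `urllib.request.urlopen` should return a JSON document; in the simple format, the transcription is expected in a `DisplayText` field.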

Language API

To add natural language understanding to applications, Azure provides the Language APIs. Developers can integrate them with chatbots, IoT devices, and more, or use them to perform text analytics. The following list shows the Azure Language APIs; some of them are described below:

  • Language Understanding LUIS
  • QnA Maker
  • Text Analytics
  • Translator Text

Language Understanding (LUIS) is a service that applies custom machine-learning intelligence to natural language text to predict overall meaning and pull out relevant, detailed information.  LUIS can be integrated into apps, bots, and IoT devices.  LUIS can understand around 15 languages.  LUIS apps contain domain-specific natural language models, which work together.  You can start the LUIS app with one or more prebuilt models, build your own model, or blend prebuilt models with your own custom information.  LUIS can be used with any product, service, or framework with an HTTP request.
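Because LUIS is reachable through a plain HTTP request, a prediction call boils down to building a URL. The sketch below assumes the v3.0 prediction endpoint format from the public LUIS documentation; the function name, the app ID, and the key are hypothetical placeholders.

```python
import urllib.parse


def build_luis_prediction_url(endpoint, app_id, key, utterance, slot="production"):
    """Return the GET URL for a LUIS v3.0 prediction.

    endpoint, app_id, and key are placeholders for your own LUIS resource.
    """
    # urlencode takes care of escaping spaces and punctuation in the utterance.
    query = urllib.parse.urlencode({"subscription-key": key, "query": utterance})
    return (
        f"{endpoint.rstrip('/')}/luis/prediction/v3.0"
        f"/apps/{app_id}/slots/{slot}/predict?{query}"
    )
```

Any HTTP client can then issue a GET request against this URL. The JSON response is expected to carry the predicted intent under `prediction.topIntent`, along with any extracted entities.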

Azure Text Analytics is part of the language tools and provides language processing over raw text with four main functions: sentiment analysis, key phrase extraction, named entity recognition, and language detection. The API can be used to analyze unstructured text. No training data is needed to use this API; just bring your text data. This API uses advanced natural language processing techniques to deliver best-in-class predictions.
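To show what "just bring your text data" can mean in practice, the following sketch wraps a list of raw strings into a sentiment analysis request. The `v3.1` path and the `documents` payload shape are based on the public Text Analytics REST reference as we understand it; the function name, the endpoint, and the key are placeholders we introduce for illustration.

```python
import json
import urllib.request


def build_sentiment_request(endpoint, key, texts, language="en"):
    """Build (but do not send) a Text Analytics sentiment request.

    endpoint and key are placeholders for the values of your own Azure resource.
    """
    # Each text becomes a document with its own id so results can be matched back.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # API key authentication
        "Content-Type": "application/json",
    }
    url = f"{endpoint.rstrip('/')}/text/analytics/v3.1/sentiment"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

The other functions (key phrase extraction, entity recognition, language detection) are exposed under sibling paths and accept a similar `documents` payload, so the same pattern should carry over with a different URL suffix.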

We will cover the rest of the Cognitive Services and their APIs in our next blog post. Let us know in the comments how this post has helped you plan your roadmap towards deploying AI in your organization.

If you want to continue learning about Azure Cognitive Services, take a look at the second post in this series, where we list the main Cognitive Services APIs.
