C++ calls Microsoft's own speech recognition interface to get started quickly-C#.Net Tutorial-php.cn

Home

Backend Development

C#.Net Tutorial

C++ calls Microsoft's own speech recognition interface to get started quickly

PHPz

Apr 03, 2017 am 11:31 AM

c++interfaceSpeech Recognition

Quick Start with C++ Speech Recognition Interface (Microsoft Speech SDK)

I recently used Microsoft’s C++ speech recognition interface in my graduation project and searched a lot I also encountered many problems with the materials and took many detours. Now I write down my own experience, one is to improve myself, and the other is to repay the society. I hope that after reading this blog, you will learn how to implement the C++ speech recognition interface in 5 minutes. (The platform used is win8+VS2013)

1. Install SDK

Install MicrosoftSpeechPlatformSDK.msi, just install it to the default path.
Download path:
download.csdn.net/detail/michaelliang12/9510691

2. Create a new project and configure the environment

Settings:
1, Properties – Configuration properties –C/C++–General–Additional include directory: C:\Program Files\Microsoft SDKs\Speech\v11.0\Include (the specific path is related to the installation path)
2, Properties – Configuration Properties – Linker – Input – Additional dependencies: sapi.lib;

3. Speech recognition code

The speech recognition interface can be divided into text-to-speech and speech-to-text

1. Text-to-speech

Header files that need to be added:

#include <sapi.h> //导入语音头文件#pragma comment(lib,"sapi.lib") //导入语音头文件库

Function:

void  CBodyBasics::MSSSpeak(LPCTSTR speakContent)// speakContent为LPCTSTR型的字符串,调用此函数即可将文字转为语音{
    ISpVoice *pVoice = NULL;    //初始化COM接口

    if (FAILED(::CoInitialize(NULL)))
        MessageBox(NULL, (LPCWSTR)L"COM接口初始化失败！", (LPCWSTR)L"提示", MB_ICONWARNING | MB_CANCELTRYCONTINUE | MB_DEFBUTTON2);    //获取SpVoice接口

    HRESULT hr = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void**)&pVoice);    if (SUCCEEDED(hr))
    {
        pVoice->SetVolume((USHORT)100); //设置音量，范围是 0 -100
        pVoice->SetRate(2); //设置速度，范围是 -10 - 10
        hr = pVoice->Speak(speakContent, 0, NULL);

        pVoice->Release();

        pVoice = NULL;
    }    //释放com资源
    ::CoUninitialize();
}

2. Voice to text

This is a little more troublesome because it requires real-time monitoring of the microphone, which involves to the messaging mechanism of Windows.
(1) First set the project properties:
Properties – Configuration Properties – C/C++ – Preprocessor – Preprocessor Definition: _WIN32_DCOM;

(2) Header files that need to be added:

#include <sapi.h> //导入语音头文件#pragma comment(lib,"sapi.lib") //导入语音头文件库#include //语音识别头文件#include //要用到CString#pragma onceconst int WM_RECORD = WM_USER + 100;//定义消息

(3) Define variables in the .h header file of the program

//定义变量CComPtr<ISpRecognizer>m_cpRecoEngine;// 语音识别引擎(recognition)的接口。CComPtr<ISpRecoContext>m_cpRecoCtxt;// 识别引擎上下文(context)的接口。CComPtr<ISpRecoGrammar>m_cpCmdGrammar;// 识别文法(grammar)的接口。CComPtr<ISpStream>m_cpInputStream;// 流()的接口。CComPtr<ISpObjectToken>m_cpToken;// 语音特征的(token)接口。CComPtr<ISpAudio>m_cpAudio;// 音频(Audio)的接口。(用来保存原来默认的输入流)ULONGLONG  ullGrammerID;

(4) Create a speech recognition initialization function (called when the program just starts executing, for example, in the example code at the end of the article, This initialization function is placed in the response code of the dialog box initialization message WM_INITDIALOG)

//语音识别初始化函数void  CBodyBasics::MSSListen()
{    //初始化COM接口

    if (FAILED(::CoInitialize(NULL)))
        MessageBox(NULL, (LPCWSTR)L"COM接口初始化失败！", (LPCWSTR)L"提示", MB_ICONWARNING | MB_CANCELTRYCONTINUE | MB_DEFBUTTON2);


    HRESULT hr = m_cpRecoEngine.CoCreateInstance(CLSID_SpSharedRecognizer);//创建Share型识别引擎
    if (SUCCEEDED(hr))
    {


        hr = m_cpRecoEngine->CreateRecoContext(&m_cpRecoCtxt);//创建识别上下文接口

        hr = m_cpRecoCtxt->SetNotifyWindowMessage(m_hWnd, WM_RECORD, 0, 0);//设置识别消息

        const ULONGLONG ullInterest = SPFEI(SPEI_SOUND_START) | SPFEI(SPEI_SOUND_END) | SPFEI(SPEI_RECOGNITION);//设置我们感兴趣的事件
        hr = m_cpRecoCtxt->SetInterest(ullInterest, ullInterest);

        hr = SpCreateDefaultObjectFromCategoryId(SPCAT_AUDIOIN, &m_cpAudio);
        m_cpRecoEngine->SetInput(m_cpAudio, true);        //创建语法规则
        //dictation听说式
        //hr = m_cpRecoCtxt->CreateGrammar(GIDDICTATION, &m_cpDictationGrammar);
        //if (SUCCEEDED(hr))
        //{
        //  hr = m_cpDictationGrammar->LoadDictation(NULL, SPLO_STATIC);//加载词典
        //}

        //C&C命令式，此时语法文件使用xml格式
        ullGrammerID = 1000;
        hr = m_cpRecoCtxt->CreateGrammar(ullGrammerID, &m_cpCmdGrammar);

        WCHAR wszXMLFile[20] = L"";//加载语法
        MultiByteToWideChar(CP_ACP, 0, (LPCSTR)"CmdCtrl.xml", -1, wszXMLFile, 256);//ANSI转UNINCODE
        hr = m_cpCmdGrammar->LoadCmdFromFile(wszXMLFile, SPLO_DYNAMIC);        //MessageBox(NULL, (LPCWSTR)L"语音识别已启动！", (LPCWSTR)L"提示", MB_CANCELTRYCONTINUE );
        //激活语法进行识别
        //hr = m_cpDictationGrammar->SetDictationState(SPRS_ACTIVE);//dictation
        hr = m_cpCmdGrammar->SetRuleState(NULL, NULL, SPRS_ACTIVE);//C&C
        hr = m_cpRecoEngine->SetRecoState(SPRST_ACTIVE);

    }    else
    {
        MessageBox(NULL, (LPCWSTR)L"语音识别引擎启动出错！", (LPCWSTR)L"警告", MB_OK);        exit(0);
    }    //释放com资源
    ::CoUninitialize();    //hr = m_cpCmdGrammar->SetRuleState(NULL, NULL, SPRS_INACTIVE);//C&C}

(5) Define the message processing function
needs to be placed together with other message processing codes, such as in the code of this article, placed The end of the DlgProc() function of the sample code at the end of the article. The entire other code blocks in this article can be copied directly. You only need to change the following Message response module

//消息处理函数USES_CONVERSION;
    CSpEvent event;    if (m_cpRecoCtxt)
    {        while (event.GetFrom(m_cpRecoCtxt) == S_OK){            switch (event.eEventId)
            {            case SPEI_RECOGNITION:
            {                                     //识别出了语音
                                     m_bGotReco = TRUE; 

                                     static const WCHAR wszUnrecognized[] = L"<Unrecognized>";

                                     CSpDynamicString dstrText;                                     ////取得识别结果 
                                     if (FAILED(event.RecoResult()->GetText(SP_GETWHOLEPHRASE, SP_GETWHOLEPHRASE, TRUE, &dstrText, NULL)))
                                     {
                                         dstrText = wszUnrecognized;
                                     }

                                     BSTR SRout;
                                     dstrText.CopyToBSTR(&SRout);
                                     CString Recstring;
                                     Recstring.Empty();
                                     Recstring = SRout;                                    //做出反应（*****消息反应模块*****）
                                    if (Recstring == "发短信")
                                     {                                         //MessageBox(NULL, (LPCWSTR)L"好的", (LPCWSTR)L"提示", MB_OK);
                                         MSSSpeak(LPCTSTR(_T("好，马上发短信！")));

                                     }                                     else if (Recstring == "李雷")
                                     {
                                         MSSSpeak(LPCTSTR(_T("好久没看见他了，真是 long time no see")));
                                     }   

            }                break;
            }
        }
    }

(6) Modify the syntax file
Modify the CmdCtrl.xml file, you can Improve the recognition of certain words, and the recognition effect of the words inside will be much better, such as people's names, etc. (In addition, when running the exe alone, you also need to put this file in the same folder as the exe. If you do not put it, no error will be reported, but the vocabulary recognition effect in the grammar file will become worse)

<?xml version="1.0" encoding="utf-8"?><GRAMMAR LANGID="804">
  <DEFINE>
    <ID NAME="VID_SubName1" VAL="4001"/>
    <ID NAME="VID_SubName2" VAL="4002"/>
    <ID NAME="VID_SubName3" VAL="4003"/>
    <ID NAME="VID_SubName4" VAL="4004"/>
    <ID NAME="VID_SubName5" VAL="4005"/>
    <ID NAME="VID_SubName6" VAL="4006"/>
    <ID NAME="VID_SubName7" VAL="4007"/>
    <ID NAME="VID_SubName8" VAL="4008"/>
    <ID NAME="VID_SubName9" VAL="4009"/>
    <ID NAME="VID_SubNameRule" VAL="3001"/>
    <ID NAME="VID_TopLevelRule" VAL="3000"/>
  </DEFINE>
  <RULE ID="VID_TopLevelRule" TOPLEVEL="ACTIVE">
    <O>
      <L>
        <P>我要</P>
        <P>运行</P>
        <P>执行</P>
      </L>
    </O>
    <RULEREF REFID="VID_SubNameRule" />
  </RULE>
  <RULE ID="VID_SubNameRule" >
    <L PROPID="VID_SubNameRule">
      <P VAL="VID_SubName1">发短信</P>
      <P VAL="VID_SubName2">是的</P>
      <P VAL="VID_SubName3">好的</P>
      <P VAL="VID_SubName4">不用</P>
      <P VAL="VID_SubName5">李雷</P>
      <P VAL="VID_SubName6">韩梅梅</P>
      <P VAL="VID_SubName7">中文界面</P>
      <P VAL="VID_SubName8">英文界面</P>
      <P VAL="VID_SubName9">English</P>

    </L>
  </RULE></GRAMMAR>

The above is the detailed content of C++ calls Microsoft's own speech recognition interface to get started quickly. For more information, please follow other related articles on the PHP Chinese website!

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Developing with C# .NET: A Practical Guide and ExamplesMay 12, 2025 am 12:16 AM

C# and .NET provide powerful features and an efficient development environment. 1) C# is a modern, object-oriented programming language that combines the power of C and the simplicity of Java. 2) The .NET framework is a platform for building and running applications, supporting multiple programming languages. 3) Classes and objects in C# are the core of object-oriented programming. Classes define data and behaviors, and objects are instances of classes. 4) The garbage collection mechanism of .NET automatically manages memory to simplify the work of developers. 5) C# and .NET provide powerful file operation functions, supporting synchronous and asynchronous programming. 6) Common errors can be solved through debugger, logging and exception handling. 7) Performance optimization and best practices include using StringBuild

C# .NET: Understanding the Microsoft .NET FrameworkMay 11, 2025 am 12:17 AM

.NETFramework is a cross-language, cross-platform development platform that provides a consistent programming model and a powerful runtime environment. 1) It consists of CLR and FCL, which manages memory and threads, and FCL provides pre-built functions. 2) Examples of usage include reading files and LINQ queries. 3) Common errors involve unhandled exceptions and memory leaks, and need to be resolved using debugging tools. 4) Performance optimization can be achieved through asynchronous programming and caching, and maintaining code readability and maintainability is the key.

The Longevity of C# .NET: Reasons for its Enduring PopularityMay 10, 2025 am 12:12 AM

Reasons for C#.NET to remain lasting attractive include its excellent performance, rich ecosystem, strong community support and cross-platform development capabilities. 1) Excellent performance and is suitable for enterprise-level application and game development; 2) The .NET framework provides a wide range of class libraries and tools to support a variety of development fields; 3) It has an active developer community and rich learning resources; 4) .NETCore realizes cross-platform development and expands application scenarios.

Mastering C# .NET Design Patterns: From Singleton to Dependency InjectionMay 09, 2025 am 12:15 AM

Design patterns in C#.NET include Singleton patterns and dependency injection. 1.Singleton mode ensures that there is only one instance of the class, which is suitable for scenarios where global access points are required, but attention should be paid to thread safety and abuse issues. 2. Dependency injection improves code flexibility and testability by injecting dependencies. It is often used for constructor injection, but it is necessary to avoid excessive use to increase complexity.

C# .NET in the Modern World: Applications and IndustriesMay 08, 2025 am 12:08 AM

C#.NET is widely used in the modern world in the fields of game development, financial services, the Internet of Things and cloud computing. 1) In game development, use C# to program through the Unity engine. 2) In the field of financial services, C#.NET is used to develop high-performance trading systems and data analysis tools. 3) In terms of IoT and cloud computing, C#.NET provides support through Azure services to develop device control logic and data processing.

C# .NET Framework vs. .NET Core/5/6: What's the Difference?May 07, 2025 am 12:06 AM

.NETFrameworkisWindows-centric,while.NETCore/5/6supportscross-platformdevelopment.1).NETFramework,since2002,isidealforWindowsapplicationsbutlimitedincross-platformcapabilities.2).NETCore,from2016,anditsevolutions(.NET5/6)offerbetterperformance,cross-

The Community of C# .NET Developers: Resources and SupportMay 06, 2025 am 12:11 AM

The C#.NET developer community provides rich resources and support, including: 1. Microsoft's official documents, 2. Community forums such as StackOverflow and Reddit, and 3. Open source projects on GitHub. These resources help developers improve their programming skills from basic learning to advanced applications.

The C# .NET Advantage: Features, Benefits, and Use CasesMay 05, 2025 am 12:01 AM

The advantages of C#.NET include: 1) Language features, such as asynchronous programming simplifies development; 2) Performance and reliability, improving efficiency through JIT compilation and garbage collection mechanisms; 3) Cross-platform support, .NETCore expands application scenarios; 4) A wide range of practical applications, with outstanding performance from the Web to desktop and game development.

See all articles