Speech recognition software now as good as humans

Microsoft has announced it has developed speech recognition software that is as accurate as a professional transcriptionist
Microsoft has developed software that can transcribe a conversation as accurately as a human, a significant breakthrough for artificial intelligence systems. The software not only listens to words, but is also able to place them in context, allowing for more accurate transcriptions.

The latest program from Microsoft’s research team is capable of transcribing a conversation with a word error rate of 5.9 percent, a figure comparable to the error rate of a professional transcriptionist. Xuedong Huang, Microsoft’s chief speech scientist, said in a statement that this represents a historic achievement.
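To make the headline figure concrete: word error rate is conventionally calculated as the number of word substitutions, deletions and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not Microsoft's evaluation code; the function name and the sample sentences are hypothetical, and the text normalisation (lower-casing and splitting on whitespace) is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: real evaluations apply more careful text
    normalisation than lower-casing and whitespace splitting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two substitutions in a seven-word reference.
rate = word_error_rate("we have a plan for the day",
                       "we is the plan for the day")
print(f"{rate:.1%}")
```

On this definition, a rate of 5.9 percent corresponds to roughly six word-level errors for every 100 words of reference speech.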

The result doesn’t represent perfect speech transcription; rather, the system makes errors in much the same way that humans mishear fragments of conversations. Mistakes are generally quite straightforward, such as confusing ‘have’ with ‘is’ or ‘a’ with ‘the’. Transcription mistakes from both humans and this system come from minor misinterpretations of sentences rather than a physical mishearing.

The breakthrough that led to this achievement is the use of a neural network that groups together words that not only sound similar, but also have similar meanings. For example, the words ‘fast’ and ‘quick’ are close together in the virtual dictionary of the neural network, since the use of one increases the likelihood of the other. This lets the system generalise meaning in the same way a human might.
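The idea of words sitting 'close together' can be illustrated by representing each word as a list of numbers and measuring the angle between them. The sketch below is a toy example only: the three-dimensional vectors are invented for illustration, and the use of cosine similarity is an assumption, a common way to compare such vectors rather than a description of how Microsoft's system works.

```python
import math

# Hypothetical word vectors, invented for this illustration.
# A trained network would learn vectors with far more dimensions.
toy_vectors = {
    "fast": [0.9, 0.1, 0.2],
    "quick": [0.8, 0.2, 0.1],
    "table": [0.1, 0.9, 0.7],
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Compare 'fast' with a near-synonym and with an unrelated word.
for other in ("quick", "table"):
    score = cosine_similarity(toy_vectors["fast"], toy_vectors[other])
    print(f"fast vs {other}: {score:.2f}")
```

With these made-up numbers, 'fast' scores much higher against 'quick' than against 'table', which is the sense in which similar words are neighbours in the network's dictionary.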

What the system cannot do is understand what it is listening to. While it is able to accurately transcribe speech, it does not understand what is being said, so cannot, for example, answer a question.

The primary use of the software is likely to be in Microsoft products that use speech recognition, like the Xbox and the Cortana virtual assistant. The next stage for the research team is to modify the system so it can still function in places with a large amount of background noise, or listen to multiple voices.