Voice Recognition


Voice Recognition

al.soto
Hi all,

I'm trying to use Dragon Naturally Speaking or even Windows 7 speech recognition inside of ImageJ to convert audio to text and save that text as metadata for the current image.

Any suggestions on how to do that?

Alejandro

Re: Voice Recognition

Andreas Maier
Hi Alejandro,

As far as I know, Dragon Naturally Speaking sends the recognized text
to a text field. All you would have to provide is a suitable widget
that accepts the input from Dragon Naturally Speaking.

Or are you talking about prerecorded speech that you now want to assign
as metadata? Is it your own speech, i.e. is the speech recognition system
trained for that speaker?

Best,

Andreas

On 8/2/2010 12:04 PM, al.soto wrote:

> Hi all,
>
> I'm trying to use Dragon Naturally Speaking or even Windows 7 speech
> recognition inside of ImageJ to convert audio to text and save that text as
> metadata for the current image.
>
> Any suggestions on how to do that?
>
> Alejandro
>    


--
Dr.-Ing. Andreas Maier
Stanford University
Department of Radiology
The Lucas Center for Imaging
Mail Code 5488, Route 8
Stanford, CA  94305
http://med.stanford.edu/profiles/Andreas_Maier/

Re: Voice Recognition

al.soto
Hi Andreas,

The way that this would go is as follows:
1. Do the experiment (acquiring images).
2. While doing the experiment, speak the name of the experiment and relevant information.
3. Finally, save the file with the spoken name.
4. Save the rest of the speech as metadata.

Where would I find the widget? Or would it have to be written?

Thanks

Alejandro

Re: Voice Recognition

Andreas Maier
Hi Alejandro,

As far as I remember, Dragon Naturally Speaking has a batch mode to
process prerecorded files. So basically you would need a macro or a
custom plug-in which invokes the Dragon Speech Reco engine, parses the
output and puts it into the meta header.
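
The parsing step could look roughly like the sketch below. It assumes the recognized text arrives as a plain string and that the experiment name is spoken first, so the first line becomes the file name and the rest becomes metadata. The class name TranscriptSplitter and the file-name clean-up rule are illustrative assumptions, not part of ImageJ or Dragon.

```java
/**
 * Splits a recognized transcript into a file name and metadata.
 * Assumption: the first line of the transcript is the spoken experiment
 * name; everything after it is free-form metadata.
 */
final class TranscriptSplitter {
    final String fileName;
    final String metadata;

    private TranscriptSplitter(String fileName, String metadata) {
        this.fileName = fileName;
        this.metadata = metadata;
    }

    static TranscriptSplitter parse(String transcript) {
        String text = transcript == null ? "" : transcript.trim();
        int newline = text.indexOf('\n');
        String first = newline < 0 ? text : text.substring(0, newline);
        String rest = newline < 0 ? "" : text.substring(newline + 1).trim();

        // Replace anything that is unsafe in a file name with an underscore.
        String name = first.trim().replaceAll("[^A-Za-z0-9._-]+", "_");
        if (name.isEmpty()) {
            name = "untitled"; // hypothetical fallback when nothing was recognized
        }
        return new TranscriptSplitter(name, rest);
    }
}
```

Inside an ImageJ plug-in, the metadata string could then be attached to the current image with imp.setProperty("Info", metadata) before saving as TIFF; whether that is the right place for it depends on your workflow.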

I guess you will get some recognition errors during this process.
Hence, you may want to keep the audio files, since the recognized text
may turn out to be unintelligible.

In a custom plug-in you could even do the voice recording via Java. For
someone who knows the respective APIs, I guess it would be two or three
days of work to develop such a plug-in.

Maybe there is even a chance to store the audio data within the image
header. This of course depends on the image format. Which one are you
using?

Btw, what names are you referring to? Names of patients for example may
pose a serious problem to the speech reco engine...

Best,

Andreas

On 8/2/2010 5:32 PM, al.soto wrote:

> Hi Andreas,
>
> The way that this would go is as follows:
> Do experiment (acquiring images)
> While doing experiment speak the name of experiment and relevant information
> Finally save the file with the spoken name
> Save rest of speech as metadata.
>
> Where would I find the widget? Or would it have to be written?
>
> Thanks
>
> Alejandro
>    



Re: Voice Recognition

al.soto
Hi Andreas,

This is designed for doing experiments on cell cultures. I will also be saving images as TIFF files. Is there an image format that can hold audio data?

As for voice recording in Java, do you mean just recording voice data and saving it as a .wav file? I have to look closely to see how to use Dragon Naturally Speaking.

Andreas Maier wrote:
> as far as I remember, Dragon Naturally Speaking has a batch mode to
> process prerecorded files. So basically you would need a macro or a
> custom plug-in which invokes the Dragon Speech Reco engine, parses the
> output and puts it into the meta header.
>
> In a custom plug-in you could even do the voice recording via Java. For
> someone who knows the respective APIs, I guess it would be two or three
> days of work to develop such a plug-in.

I'm still pretty new to Java and ImageJ plugins, but is it really possible to invoke the Dragon Speech Reco engine? Would you happen to know how to do that?

Thanks for all your help

Alejandro

Re: Voice Recognition

Andreas Maier
Hi Alejandro,
> This is designed for doing experiments on cell cultures. I also will be
> saving images as tiff files. Is there an image format that can use audio
> data?
>    

I guess one could store some information in the private fields of the
TIFF header. It may be better, though, to reference the location of a
separate audio file. I'm no expert on image formats.
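
Referencing a separate audio file could be as simple as a small text file saved next to the image. The sketch below uses java.util.Properties for that; the class name AudioSidecar, the ".properties" suffix and the keys "audio.file" and "transcript" are made up for illustration and are not an ImageJ or TIFF convention.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Stores a reference to an audio note in a sidecar file next to an image. */
final class AudioSidecar {

    /** e.g. plate3.tif -> plate3.tif.properties (naming is an assumption). */
    static Path sidecarFor(Path image) {
        return image.resolveSibling(image.getFileName().toString() + ".properties");
    }

    static Path write(Path image, Path audio, String transcript) throws IOException {
        Properties props = new Properties();
        // Store only the file name so the pair can be moved together.
        props.setProperty("audio.file", audio.getFileName().toString());
        props.setProperty("transcript", transcript);
        Path sidecar = sidecarFor(image);
        try (Writer out = Files.newBufferedWriter(sidecar, StandardCharsets.UTF_8)) {
            props.store(out, "Voice annotation for " + image.getFileName());
        }
        return sidecar;
    }

    static Properties read(Path image) throws IOException {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(sidecarFor(image), StandardCharsets.UTF_8)) {
            props.load(in);
        }
        return props;
    }
}
```

The image itself stays untouched this way, which avoids depending on how a particular TIFF reader treats private fields.
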
> As for voice recording in Java, do you mean just recording voice data and
> saving as a .wav file? I have to look closely to see how to used Dragon
> Naturally Speaking.
>    

I guess you could just write your own speech recorder using the Java
Sound API:

http://java.sun.com/products/java-media/sound/

Alternatively you could also use some open-source recorder such as:

http://www.quadmore.com/swingrecorder/

Both should give you the capabilities to record audio via Java.
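
A minimal recorder based on the Java Sound API (javax.sound.sampled) might look like the sketch below. The 16 kHz, 16-bit mono format is an assumption that is common for speech; check what your recognizer expects. The class name VoiceNoteRecorder is illustrative, and record() needs a working microphone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;

final class VoiceNoteRecorder {
    // Assumed format: 16 kHz, 16-bit, mono, signed, little-endian PCM.
    static final AudioFormat FORMAT = new AudioFormat(16000f, 16, 1, true, false);

    /** Records from the default microphone for roughly the given time. */
    static byte[] record(int millis) throws LineUnavailableException {
        TargetDataLine line = AudioSystem.getTargetDataLine(FORMAT);
        line.open(FORMAT);
        line.start();
        ByteArrayOutputStream pcm = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096]; // multiple of the 2-byte frame size
        long end = System.currentTimeMillis() + millis;
        try {
            while (System.currentTimeMillis() < end) {
                int n = line.read(buffer, 0, buffer.length);
                pcm.write(buffer, 0, n);
            }
        } finally {
            line.stop();
            line.close();
        }
        return pcm.toByteArray();
    }

    /** Wraps raw PCM bytes in a WAV container and writes them to disk. */
    static void writeWav(byte[] pcm, File target) throws IOException {
        long frames = pcm.length / FORMAT.getFrameSize();
        try (AudioInputStream in =
                 new AudioInputStream(new ByteArrayInputStream(pcm), FORMAT, frames)) {
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, target);
        }
    }
}
```

Calling writeWav(record(5000), new File("note.wav")) would capture about five seconds of speech. A real plug-in would more likely record on a background thread with start and stop buttons.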

>
> Andreas Maier wrote:
>    
>> as far as I remember, Dragon Naturally Speaking has a batch mode to
>> process prerecorded files. So basically you would need a macro or a
>> custom plug-in which invokes the Dragon Speech Reco engine, parses the
>> output and puts it into the meta header.
>>
>> In a custom plug-in you could even do the voice recording via Java. For
>> someone who knows the respective APIs, I guess it would be two or three
>> days of work to develop such a plug-in.
>>
>>      
> I'm still pretty new to Java and ImageJ plugins but is it really possible to
> invoke the Dragon Speech Reco engine? Would you happen to know how to do
> that?
>
> Thanks for all your help
>
> Alejandro
>    

I guess you could use the batch interface of Dragon Naturally Speaking.
There should be some command line interface which you can invoke via
Runtime.exec.
There is also a complete Java Speech Recognition Engine available:
http://cmusphinx.sourceforge.net/sphinx4/
But I guess it probably doesn't have the nice acoustic models that are
supplied with Dragon's speech recognition engine. (And it will most
likely not be able to read Dragon's models.)
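
For the command-line route mentioned above, here is a rough sketch of launching an external program and capturing its output. It uses ProcessBuilder rather than Runtime.exec directly, since it makes merging stderr into stdout easier. The actual command name and arguments for Dragon's batch mode are not given in this thread, so the caller has to supply them; the class name ExternalTool is illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class ExternalTool {

    /** Runs the command, waits for it to finish and returns what it printed. */
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // read stderr and stdout as one stream
        Process process = builder.start();

        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                captured.write(buffer, 0, n);
            }
        }
        // Assumption: the tool prints UTF-8; adjust the charset if it does not.
        String output = new String(captured.toByteArray(), StandardCharsets.UTF_8);

        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("Command exited with code " + exit + ": " + output);
        }
        return output;
    }
}
```

The returned string would then be the text to parse and store as metadata, assuming the tool writes its transcript to standard output rather than to a file.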

In any case it's quite a bit of work. (Using the Java engine will be
even more work, but of course the most beautiful solution.)

Best,

Andreas



Re: Voice Recognition

seanhenry
Dragon's speech recognition software is good, and you can use it. If you are converting business-related audio, though, it won't give very exact results, and neither will any other speech recognition software. It is fine for personal use. I use the www.transcriptionsservice.com service to convert my interview audio to text.