Topic: Differentiating human voice from other sounds like horn, tapping, bird (Read 4886 times)

Differentiating human voice from other sounds like horn, tapping, bird

Hi all :-),

I am currently working on an academic project to detect human speech under varying SNR conditions. I have implemented a method (using SNR band energy and SNR peaks in the frequency domain) that works well for detecting voice activity in general, but it cannot isolate human speech from other sounds. I have tried speech feature extraction, but I cannot make reliable decisions because the threshold values vary between environments.

Could someone please suggest how to detect only human voice activity? Any suggestions would be very helpful.

Thank you,
ksam917

Reply #1
If it's an academic project you should have access to many relevant journals and papers on the same or similar subjects which will get you started.

Reply #2
Just a hint - thresholds should not be absolute, they should be adaptive/relative. Absolute thresholds work only in a well defined/controlled environment. Have you tried some sort of normalization of the extracted features?
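
As a rough illustration of the normalization idea in this reply, the sketch below z-scores a per-frame feature against exponentially weighted running statistics, so the same event produces similar values at different recording levels. The function name `normalize_features`, the smoothing factor `alpha`, and the input format are assumptions made for illustration; nothing here comes from the original poster's implementation.

```python
import math


def normalize_features(values, alpha=0.05, eps=1e-9):
    """Z-score each value against exponentially weighted running statistics.

    values: per-frame feature values, e.g. band energy in dB (hypothetical input).
    alpha:  smoothing factor for the running mean and variance (assumed value).
    eps:    small constant that avoids division by zero on constant input.
    """
    if not values:
        return []
    mean = values[0]  # start the running mean at the first frame
    var = 0.0         # running variance starts at zero
    normalized = []
    for x in values:
        diff = x - mean
        mean += alpha * diff                              # update running mean
        var = (1.0 - alpha) * (var + alpha * diff * diff)  # update running variance
        normalized.append((x - mean) / math.sqrt(var + eps))
    return normalized
```

Because each value is measured relative to the recent mean and spread, adding a constant offset to every frame (for example a louder environment, when the feature is in dB) leaves the normalized values unchanged, so a single threshold on the normalized feature can be reused across recordings.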

Reply #3
Quote (Reply #1): If it's an academic project you should have access to many relevant journals and papers on the same or similar subjects which will get you started.

I have access to journals, but I am not finding suitable ones, and I am confused about where to start and how to proceed.

Reply #4
Quote (Reply #2): Just a hint - thresholds should not be absolute, they should be adaptive/relative. Absolute thresholds work only in a well defined/controlled environment. Have you tried some sort of normalization of the extracted features?

Yes, I agree with you. The thresholds I set vary between environments. But even if we normalize, these values will keep varying, right? How should the adaptive thresholds be set?
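
One common way to make a threshold adaptive, sketched here under stated assumptions, is to track a running estimate of the background level and compare each frame against that estimate plus a fixed margin. The names `adaptive_vad`, `margin_db`, `alpha` and `init_frames` are hypothetical, and the sketch assumes the first few frames contain only background noise. It flags activity in general; it does not by itself separate speech from horns or tapping.

```python
def adaptive_vad(energies_db, margin_db=6.0, alpha=0.1, init_frames=5):
    """Flag frames whose energy exceeds a tracked noise floor by margin_db.

    energies_db: per-frame energy in dB (hypothetical input).
    margin_db:   required distance above the noise floor (assumed value).
    alpha:       smoothing factor for the noise floor update (assumed value).
    init_frames: leading frames assumed to contain background noise only.
    """
    if len(energies_db) < init_frames:
        raise ValueError("need at least init_frames frames to initialise")
    # Initial noise floor: average of the leading frames.
    noise = sum(energies_db[:init_frames]) / init_frames
    flags = []
    for e in energies_db:
        active = e > noise + margin_db  # the threshold moves with the noise floor
        if not active:
            # Update the floor only on inactive frames so that active
            # segments do not pull the estimate upwards.
            noise += alpha * (e - noise)
        flags.append(active)
    return flags
```

The threshold here is relative (`noise + margin_db`), so the same margin gives the same decisions in a quiet room and in a noisy street as long as the event stands out from its background by a similar amount. The same pattern can be applied to any scalar feature, not only energy.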

Reply #5
Consult your academic supervisor. Doing an academic project without any sort of supervision/leadership sucks and is a huge waste of time, in my opinion (been there, done that).