GenreDetector app by SoftTeco

Our team continues to explore new technologies and new opportunities in development. Today we want to share our experience of creating the GenreDetector app. In short, we trained a model to recognize music genres from audio samples using the Create ML application.

At first, we used 10 music genres as a basis, plus one separate class for recognizing background sounds (noise, crackling, humming, silence). The sound model was trained to recognize specific sounds (for example, a guitar, applause, sneezing, drums) rather than an entire musical composition. Therefore, each genre folder contains several sound files typical of that genre. The more files a folder holds, the better trained the final model will be.

Next, we dragged the sample folder into Create ML and clicked the play button to start training. After 5-10 minutes the model was trained and ready for testing: we could input a music file or have the model listen live. Sound recognition worked well in real time. The name shown on the display matched the name of the folder for the detected genre.

Here is an important note: if the results are not satisfactory, it’s better to select more suitable samples.

As a next step, we integrated the model: we dragged the model file into the project itself and got a class with the same name as the model (Xcode generates it from the model file).

Then, we connected the SoundAnalysis framework for sound analysis and recognition.
To work on a task like this, you also need the AVFoundation framework (available through AVKit) for working with media. We used AVAudioEngine, and with its help we implemented real-time recognition of sound from the device's microphone.
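
A minimal sketch of how microphone audio can be fed from AVAudioEngine into SoundAnalysis. The class name `LiveGenreListener`, the queue label, and the buffer size of 8192 frames are assumptions made for illustration, not details from the project; `model` stands for the Core ML model trained in Create ML.

```swift
import AVFoundation
import CoreML
import SoundAnalysis

/// Hypothetical helper: streams microphone buffers into a sound classifier.
final class LiveGenreListener {
    private let engine = AVAudioEngine()
    private var analyzer: SNAudioStreamAnalyzer?
    // Analysis runs on its own serial queue so the audio tap is not blocked.
    private let analysisQueue = DispatchQueue(label: "genre.analysis")

    func start(model: MLModel, observer: SNResultsObserving) throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)

        // The stream analyzer must be created with the format of the incoming audio.
        let analyzer = SNAudioStreamAnalyzer(format: format)
        let request = try SNClassifySoundRequest(mlModel: model)
        try analyzer.add(request, withObserver: observer)
        self.analyzer = analyzer

        // Forward every captured buffer to the analyzer.
        input.installTap(onBus: 0, bufferSize: 8192, format: format) { [weak self] buffer, time in
            self?.analysisQueue.async {
                self?.analyzer?.analyze(buffer, atAudioFramePosition: time.sampleTime)
            }
        }
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        analyzer = nil
    }
}
```

On iOS the app also needs microphone permission (the `NSMicrophoneUsageDescription` key in Info.plist) before the engine can capture audio.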

Additionally, we converted RMS to decibels. The decibel values in iOS range from -160 dB, close to silence, to 0 dB, the maximum power. We analyzed only sound louder than -40 dB to weed out extraneous sounds.
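
As an illustration, the conversion and the -40 dB gate could look like the sketch below. It assumes samples normalized to the range -1...1 and the common formula dB = 20 · log10(RMS); the function names are hypothetical, and the formula actually used in the app may differ.

```swift
import Foundation

/// Root mean square of a block of samples (assumed to lie in -1...1).
func rms(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// Converts an RMS amplitude to decibels, floored at -160 dB (near silence).
func decibels(fromRMS value: Float, silenceFloor: Float = -160) -> Float {
    guard value > 0 else { return silenceFloor }
    return max(silenceFloor, 20 * log10(value))
}

/// True when the block is louder than the threshold and worth analyzing.
func isLoudEnough(_ samples: [Float], threshold: Float = -40) -> Bool {
    return decibels(fromRMS: rms(samples)) > threshold
}
```

With this gate, a block whose RMS is 0.001 (about -60 dB) is skipped, while a block whose RMS is 0.5 (about -6 dB) is passed on for analysis.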

When the task was to recognize a finished audio file, the steps described above were enough to begin the analysis right away.
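
A minimal sketch of analyzing a finished file with SoundAnalysis. The function name `analyzeFile` is an assumption; `model` is the Core ML model from Create ML and `observer` is any object that conforms to `SNResultsObserving`.

```swift
import CoreML
import Foundation
import SoundAnalysis

/// Hypothetical helper: runs a sound classifier over a whole audio file.
func analyzeFile(at url: URL, model: MLModel, observer: SNResultsObserving) throws {
    // The request wraps the trained model.
    let request = try SNClassifySoundRequest(mlModel: model)
    // The file analyzer reads and decodes the audio itself.
    let analyzer = try SNAudioFileAnalyzer(url: url)
    try analyzer.add(request, withObserver: observer)
    // Synchronous: returns after the whole file has been processed.
    analyzer.analyze()
}
```

If the generated class were called `GenreClassifier` (a hypothetical name), the model could be obtained with `try GenreClassifier(configuration: MLModelConfiguration()).model`. Because `analyze()` blocks until the file is processed, it is usually called off the main thread.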

Next, an observer was required for processing the results. As soon as the analysis was completed, we calculated the percentage of each genre in the original sound recording.

For the finished sound file:
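
A sketch of what such an observer might look like, assuming the percentage is the share of analysis windows in which a genre was the top classification. The names `FileGenreObserver`, `genrePercentages`, and `onFinish` are hypothetical.

```swift
import Foundation
import SoundAnalysis

/// Share of each label among the collected per-window labels, in percent.
func genrePercentages(_ labels: [String]) -> [String: Double] {
    guard !labels.isEmpty else { return [:] }
    var counts: [String: Int] = [:]
    for label in labels { counts[label, default: 0] += 1 }
    return counts.mapValues { Double($0) / Double(labels.count) * 100 }
}

/// Hypothetical observer: collects the top genre of every window in a file.
final class FileGenreObserver: NSObject, SNResultsObserving {
    private var labels: [String] = []
    /// Called once with the per-genre percentages when the file is done.
    var onFinish: (([String: Double]) -> Void)?

    func request(_ request: SNRequest, didProduce result: SNResult) {
        // Keep only the most confident classification of this window.
        guard let result = result as? SNClassificationResult,
              let top = result.classifications.first else { return }
        labels.append(top.identifier)
    }

    func request(_ request: SNRequest, didFailWithError error: Error) {
        print("Analysis failed: \(error.localizedDescription)")
    }

    func requestDidComplete(_ request: SNRequest) {
        onFinish?(genrePercentages(labels))
    }
}
```

For example, if three of four windows are classified as "Rock" and one as "Jazz", the result is 75% Rock and 25% Jazz.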

For real-time recognition:
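
A sketch of an observer for live audio, assuming each result is reported as soon as it arrives and that weak guesses are dropped. The names `LiveGenreObserver`, `confidencePercent`, and `onGenre`, as well as the 0.5 confidence cutoff, are assumptions for illustration.

```swift
import Foundation
import SoundAnalysis

/// Converts a confidence in 0...1 to a whole percentage.
func confidencePercent(_ confidence: Double) -> Int {
    return Int((confidence * 100).rounded())
}

/// Hypothetical observer: reports the top genre of every live result.
final class LiveGenreObserver: NSObject, SNResultsObserving {
    /// Called with the genre name and its confidence in percent.
    var onGenre: ((String, Int) -> Void)?
    /// Assumed cutoff; results below it are ignored.
    let minimumConfidence = 0.5

    func request(_ request: SNRequest, didProduce result: SNResult) {
        guard let result = result as? SNClassificationResult,
              let top = result.classifications.first,
              top.confidence >= minimumConfidence else { return }
        onGenre?(top.identifier, confidencePercent(top.confidence))
    }

    func request(_ request: SNRequest, didFailWithError error: Error) {
        print("Live analysis failed: \(error.localizedDescription)")
    }
}
```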

Then, we updated the label with the genre name:
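
A minimal sketch of such an update, assuming a UIKit screen with a single label. The names `GenreViewController`, `genreLabel`, and `genreText` are hypothetical. Analysis results may arrive off the main thread, so the update is dispatched to the main queue.

```swift
import UIKit

/// Text shown for a recognized genre, e.g. "Rock: 87%".
func genreText(genre: String, percent: Int) -> String {
    return "\(genre): \(percent)%"
}

/// Hypothetical screen that displays the recognized genre.
final class GenreViewController: UIViewController {
    @IBOutlet private weak var genreLabel: UILabel!

    func show(genre: String, percent: Int) {
        // UIKit views must only be touched on the main thread.
        DispatchQueue.main.async { [weak self] in
            self?.genreLabel.text = genreText(genre: genre, percent: percent)
        }
    }
}
```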

With the GenreDetector app complete, we continue to investigate possible use cases for Create ML and real-world applications for this technology.