Emergency siren sound identification in traffic

Hello all,

I’m not sure whether this is the best place to post this, as it’s not yet a “project”, more of a “project idea”. I’m curious what others think about its feasibility, and the idea itself. It’s more of a subject for an open discussion.

Here it is:

Sometimes when one’s driving a car in traffic, there is an emergency vehicle somewhere and the siren can be heard. It is often unclear, especially in heavy traffic, where the sound is coming from. My idea would be:

Step 1) identify that there is a siren sound and display that information to the driver (the idea here is that the device will hear and identify such signal before the driver does, so it’s a “heads up”)

Step 2) identify the direction the sound is coming from, and display that information: this would help moving away from the emergency vehicle’s path in case it’s coming from behind us, or avoid getting in its way when it’s approaching the same intersection as we are (i.e. stopping before entering the intersection, etc.)

Step 2) would require a microphone array and seems very complicated, so the focus is on identifying the sound signal itself first.
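For a rough idea of what Step 2) involves, here is a minimal sketch of the simplest case: two microphones, a far-field source, and the delay between them estimated by cross-correlation. The microphone spacing, sample rate, speed of sound and the function names are all assumptions made for illustration, and the test signal is broadband noise rather than a real siren. A near-tonal siren is harder, because a periodic signal makes the correlation peak ambiguous.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at about 20 °C)

def estimate_delay(mic_a, mic_b, max_lag):
    """Lag in samples by which mic_b trails mic_a (equal-length inputs)."""
    n = len(mic_a)

    def corr(lag):
        # Cross-correlation at one lag, using only the overlapping samples.
        return sum(mic_a[i] * mic_b[i + lag]
                   for i in range(max(0, -lag), min(n, n - lag)))

    return max(range(-max_lag, max_lag + 1), key=corr)

def arrival_angle_deg(delay_samples, sample_rate, mic_spacing,
                      c=SPEED_OF_SOUND):
    """Angle from broadside in degrees; positive means nearer to mic_a."""
    x = c * delay_samples / sample_rate / mic_spacing
    x = max(-1.0, min(1.0, x))  # guard against rounding outside [-1, 1]
    return math.degrees(math.asin(x))

# Demo with hypothetical values: mic_b hears the same noise 14 samples late.
random.seed(0)
SAMPLE_RATE = 48000   # Hz, assumed
MIC_SPACING = 0.2     # m, assumed
source = [random.gauss(0.0, 1.0) for _ in range(2000)]
true_delay = 14
mic_a = source
mic_b = [0.0] * true_delay + source[:-true_delay]

max_lag = int(MIC_SPACING / SPEED_OF_SOUND * SAMPLE_RATE) + 1
lag = estimate_delay(mic_a, mic_b, max_lag)
angle = arrival_angle_deg(lag, SAMPLE_RATE, MIC_SPACING)
print(f"delay: {lag} samples, angle: {angle:.1f} degrees")
```

Two microphones only give a left/right angle along one axis; telling front from back would need at least a third microphone.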

It seems relatively difficult, for multiple reasons:

  • a) various countries and various emergency services may have different signals, and it’s not easy to find out the specs.

  • b) the siren sound by its very nature is quite variable due to the Doppler effect: the pitch we hear is higher than the emitted one while the source is approaching and lower while it’s moving away, with a noticeable drop as the vehicle passes. With a siren in traffic this can be heard very clearly.
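To get a feel for the size of the Doppler shift in point b), here is a small sketch using the standard formula for a moving source and a stationary listener, f_obs = f · c / (c − v), where v is the source speed toward the listener. The speed of sound, the 1000 Hz tone and the 50 km/h vehicle speed are assumed values for illustration only.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (air at about 20 °C)

def observed_frequency(f_source, v_radial, c=SPEED_OF_SOUND):
    """Frequency heard by a stationary listener.

    v_radial > 0: source moving toward the listener.
    v_radial < 0: source moving away from the listener.
    """
    if abs(v_radial) >= c:
        raise ValueError("source speed must be below the speed of sound")
    return f_source * c / (c - v_radial)

v = 50 / 3.6  # 50 km/h converted to m/s (assumed vehicle speed)
approaching = observed_frequency(1000.0, v)
receding = observed_frequency(1000.0, -v)
print(f"approaching: {approaching:.1f} Hz, receding: {receding:.1f} Hz")
```

At city speeds the shift is roughly 4% in each direction, so a detector should tolerate the whole pattern sliding up or down by that much rather than expect exact frequencies.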

As for point a), I was able to find some information that might be valid for various countries: there are three signals, called “hi-lo”, “wail” and “yelp”. “Hi-lo” apparently alternates between 950 and 1150 Hz, i.e. the sound frequency is modulated by a square wave.

“Yelp” is between 500 and 2000 Hz, frequency modulated by a low-frequency triangular wave, and “wail” is 1800 Hz down to 600 Hz, driven by a sawtooth LFO. At least that’s what I got from that description. The modulation rate is low: the longest cycle is “yelp”, which takes 8 seconds to go all the way down and then up again.
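The three patterns described above could be sketched as a sine oscillator whose frequency is driven by a slow LFO. This is only an illustration: the frequency ranges and the 8-second yelp cycle are the figures quoted in this post (not verified against any siren spec), while the hi-lo and wail periods, the sample rate and all names are assumptions.

```python
import math

def lfo(shape, t, period):
    """Low-frequency oscillator value in [0, 1] at time t (seconds)."""
    x = (t % period) / period  # position within the current cycle, 0..1
    if shape == "square":
        return 1.0 if x < 0.5 else 0.0
    if shape == "triangle":
        return 2 * x if x < 0.5 else 2 * (1 - x)
    if shape == "saw_down":
        return 1.0 - x
    raise ValueError(f"unknown LFO shape: {shape}")

# kind: (f_low Hz, f_high Hz, LFO shape, LFO period in seconds)
SIRENS = {
    "hi-lo": (950.0, 1150.0, "square", 1.0),    # period is an assumption
    "yelp": (500.0, 2000.0, "triangle", 8.0),   # figures quoted in the post
    "wail": (600.0, 1800.0, "saw_down", 4.0),   # period is an assumption
}

def synth_siren(kind, duration, sample_rate=16000):
    """Return (samples, instantaneous frequency per sample)."""
    f_lo, f_hi, shape, period = SIRENS[kind]
    phase = 0.0
    samples, freqs = [], []
    for n in range(int(duration * sample_rate)):
        f = f_lo + (f_hi - f_lo) * lfo(shape, n / sample_rate, period)
        # Accumulate phase so the tone stays continuous as f changes.
        phase += 2 * math.pi * f / sample_rate
        samples.append(math.sin(phase))
        freqs.append(f)
    return samples, freqs

samples, freqs = synth_siren("hi-lo", 1.0)
print(len(samples), min(freqs), max(freqs))
```

Accumulating the phase, instead of computing sin(2πft) directly, avoids clicks when the frequency jumps or sweeps.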

The ML model would need to identify this kind of wavy pattern when it appears in the total spectrum captured by the microphone, instead of looking for specific frequencies. It might create a distinct pattern in the spectrogram, but I don’t know that yet.
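To illustrate what such a pattern might look like, here is a rough sketch that tracks the strongest frequency in short frames, which is essentially following the ridge of a spectrogram. It uses the Goertzel algorithm from the standard library only so that it runs anywhere; a real pipeline would more likely use an FFT or mel spectrogram. The frame length, candidate grid and the test sweep are assumptions.

```python
import math

def goertzel_power(frame, freq, sample_rate):
    """Signal power at `freq` within one frame (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in frame:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def dominant_frequency_track(samples, sample_rate, frame_len=512,
                             f_min=400.0, f_max=2200.0, f_step=50.0):
    """Return the strongest candidate frequency for each frame."""
    n_candidates = int((f_max - f_min) / f_step) + 1
    candidates = [f_min + i * f_step for i in range(n_candidates)]
    track = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        track.append(max(
            candidates,
            key=lambda f: goertzel_power(frame, f, sample_rate)))
    return track

# Demo: a tone sweeping linearly from 600 Hz to 1800 Hz in one second.
SR = 16000
phase, sweep = 0.0, []
for n in range(SR):
    f = 600.0 + 1200.0 * n / SR
    phase += 2 * math.pi * f / SR
    sweep.append(math.sin(phase))

track = dominant_frequency_track(sweep, SR)
print(track[0], track[len(track) // 2], track[-1])
```

The resulting track is a clean test tone with no background noise, so it says nothing yet about how visible the pattern stays once traffic is mixed in.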

The dataset: I believe the siren part would need to be synthesized. Collecting a vast number of traffic sound samples is one thing; getting many samples of a siren coming and going from different directions is another. The siren could be synthesized programmatically or electronically, and then mixed with traffic noise samples.
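The mixing step could look something like the sketch below: scale the siren so that it sits at a chosen signal-to-noise ratio relative to the traffic recording, then add the two. The function names are hypothetical, and the 1 kHz tone and Gaussian noise are stand-ins for a synthesized siren and a real traffic clip.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def mix_at_snr(signal, noise, snr_db):
    """Scale `signal` to sit `snr_db` dB relative to `noise`, then add them."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    signal_rms, noise_rms = rms(signal), rms(noise)
    if signal_rms == 0.0 or noise_rms == 0.0:
        raise ValueError("signal and noise must not be silent")
    gain = noise_rms * 10 ** (snr_db / 20) / signal_rms
    return [gain * s + n for s, n in zip(signal, noise)]

# Demo with stand-in data (not real recordings).
random.seed(1)
SR = 16000
siren = [math.sin(2 * math.pi * 1000.0 * n / SR) for n in range(SR)]
traffic = [random.gauss(0.0, 0.3) for _ in range(SR)]

# A negative SNR puts the siren below the traffic level, which is the
# interesting case if the device should react before the driver does.
mixed = mix_at_snr(siren, traffic, snr_db=-10.0)
print(f"traffic RMS: {rms(traffic):.3f}, mixed RMS: {rms(mixed):.3f}")
```

Sweeping `snr_db` over a range of values would give training examples from "barely audible" to "right next to you" out of the same two source clips.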

I’m not sure how feasible that is, or how useful: given the long time window used for capturing the signal, it might be pointless. And the model could be either inaccurate, or big.
What do you all think?

Hi @nebelgrau77,

Indeed a great application. It seems to be very relevant in busy cities. Often I have witnessed drivers getting confused when these emergency vehicles pass by.

I would like to add that loudness also seems to be a good feature to include here. Emergency vehicle sounds are usually distinct from honking or other sounds in terms of loudness. We could probably use loudness as well as the wave pattern specific to the siren.

Regarding datasets, I have come across traffic noise datasets that often report loudness levels. I am not sure if they store the waveforms. In India, major cities have fixed sound level monitors along highways to check noise pollution. Recently one of my co-researchers tried noise profiling using portable sound meters.

It’s all good, we can figure these categories out later. Thanks for sharing the idea! This is how things start.

So is this something you think we can put into vehicles, so that somehow the vehicles are smart and can help us “navigate” the situation? The reason I ask is that often I’m stumped when I’m in the car and hear this happening. I keep looking around and trying to gauge the situation, and it’s worse when my kids are screaming in the back seat! :)

@nebelgrau77 Very cool idea! Some of the sound effect sets have actual recordings, for example this one from Europe. I’m not sure how their licensing would apply to this; you might have to contact the copyright owner:

https://www.asoundeffect.com/sound-library/sirens/

For the USA (and by extension Canada), you might be able to contact the major siren manufacturers (for example Federal Signal). They might be fine with you using their recordings:

https://www.fedsig.com/file-extension/mp3

Thank you all for the feedback, I’m glad you liked the idea!

@Anavij: Yes, I had big cities in mind in particular. For me the kinda scary part is when I’m approaching an intersection, hear a siren and I’m not sure where it’s coming from: you don’t want to enter an intersection with an emergency vehicle coming from left or right!
You’re right about the loudness being above the background level, but I’m not sure whether it could be used as a feature here: the whole point is to “hear” the signal before the driver does, that is, while the sound level of the siren is still relatively low compared to the other noises.
I haven’t looked for datasets yet, but that would be a minor problem: I happen to live in a fairly big city with intense traffic, so I could always record a lot myself :)

@vjreddi: The way I see it, it’s more of an extension of our senses, but in a smart way: an extra ear tuned to particular sounds. That is why I believe the microphone would have to be placed outside the car rather than inside, precisely to avoid all those other sounds you normally have in the car: your loud passengers, the radio, etc. :) It would be an Advanced Driver-Assistance System, like the sensors detecting objects in the blind spot and such.

@SecurityGuy: Thanks, those could work, too! But I still think the main focus would need to be on getting the actual traffic noise waveforms, while the sirens could be synthesized and then mixed with the real sounds. That way the Doppler effect could easily be taken into account. There are many software synthesizers nowadays, and those siren sounds are fairly simple to reproduce. But the Federal Signal files are very interesting, thanks!
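As a sketch of how the Doppler effect could be folded into a synthesized siren, the function below gives the frequency a listener would hear while a vehicle drives past on a straight road. It is a simplification: it ignores the sound’s travel time and any change in loudness, and the speed, distance and names are assumed values, not measurements.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at about 20 °C)

def passby_frequency(f_source, speed, offset, t, c=SPEED_OF_SOUND):
    """Observed frequency at time t (seconds) for a vehicle driving past.

    The vehicle moves along a straight line at `speed` (m/s) and is closest
    to the listener, at distance `offset` (m, must be > 0), when t = 0.
    """
    if offset <= 0:
        raise ValueError("offset must be positive")
    x = speed * t                    # position along the road, 0 = closest
    dist = math.hypot(x, offset)     # straight-line distance to listener
    v_radial = -speed * x / dist     # positive while the vehicle approaches
    return f_source * c / (c - v_radial)

# Demo with assumed values: 50 km/h, passing 10 m from the listener.
speed = 50 / 3.6
for t in (-5.0, -1.0, 0.0, 1.0, 5.0):
    f = passby_frequency(1000.0, speed, 10.0, t)
    print(f"t = {t:+.1f} s: {f:.1f} Hz")
```

Multiplying a synthesizer’s instantaneous frequency by the ratio `passby_frequency(...) / f_source` at each sample would bend any of the siren patterns the same way.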

I will give it some more thought, then :) Thanks!