Anomaly Detection and Water Quality

Hi All,

One use case I’ve been thinking about that could benefit from TinyML is detecting water quality issues early: using ML to make inferences on collected sample data and potentially heading off larger problems before they happen.

However, before you can start considering ML models, you need to determine how you will collect the data needed to detect harmful substances, and you need a lot of historical data to build a solid model. That historical data may not exist yet and may have to be collected.

The specific use case I want to focus on is the detection of cyanobacteria cells and cyanotoxins. In the part of Florida (USA) where I live, we are dealing with an environmental challenge whereby too much nutrient-rich fresh water is making its way into brackish and saltwater ecosystems.

Fertilizers and other nutrients (phosphorus and nitrogen) make their way into lakes and streams through runoff from farmland. This nutrient-rich water feeds algae blooms that stifle proper water flow and kill fish and plants by depleting the oxygen they need.

In the US, the Dept. of the Interior’s USGS (USGS Water Data for the Nation) manages water quality testing nationwide. Their water testing stations automatically record the following data elements on a daily basis:

Temperature, water, degrees Celsius – TOP
Temperature, water, degrees Celsius – BOTTOM
Gage height, feet
Specific conductance, water, unfiltered, microsiemens per centimeter at 25 degrees Celsius – TOP
Specific conductance, water, unfiltered, microsiemens per centimeter at 25 degrees Celsius – BOTTOM
Salinity, water, unfiltered, parts per thousand
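
To make the anomaly detection idea concrete, here is a rough sketch of how daily readings like these could be screened before any trained model is involved. It flags a reading whose z-score against the preceding window is large. The function name, the 7-day window, the threshold of 3, and the conductance values are all made up for illustration; none of this is USGS data or a USGS method.

```python
from statistics import mean, stdev

def rolling_zscore_anomalies(readings, window=7, threshold=3.0):
    """Return indices of readings that deviate strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: z-score is undefined, so skip this point
        z = (readings[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical daily specific conductance values (microsiemens/cm), illustration only
conductance = [512, 508, 515, 510, 507, 513, 509, 511, 640, 512, 510]
print(rolling_zscore_anomalies(conductance))  # prints [8]
```

One caveat with this simple version: a spike stays in the history window for the following days and inflates the standard deviation, so a second anomaly right after the first could be missed.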

However, I am not aware of water chemistry being tracked in an automated fashion. My understanding is that these tests are done manually by field testers.

Does anyone in our community have knowledge of, or access to someone with a good understanding of, hydrology, water chemistry, and the sensors used to measure nutrient levels? It’s way outside my area of expertise.



Hi, Cam,

I was about to post on a related topic: applying ML to water-quality (WQ) management of sustainable recirculating aquaculture systems (RAS).

I’m considering how I might apply ML to enhance a WQ app I’m developing. Vijay’s courses provided the vocabulary, concepts, and an intro to his domain’s literature that really crystallized my ideas.

I likely am facing some of the same issues as you, such as how I might generate synthetic data to augment generally sparse RAS datasets.
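
As a very naive starting point for the synthetic-data question, one common option is to "jitter" a measured series: make several copies and add small Gaussian noise to each value. This is only a sketch under assumptions, not what the app does; the function name, the noise level, and the sample values are hypothetical, and a realistic noise level would have to come from the sensor's actual measurement error.

```python
import random

def jitter_series(series, noise_sd, n_copies, seed=0):
    """Return n_copies noisy variants of series, each value perturbed by Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so the augmentation is reproducible
    return [
        [x + rng.gauss(0.0, noise_sd) for x in series]
        for _ in range(n_copies)
    ]

# Hypothetical measurements, illustration only
measured = [7.1, 7.0, 6.9, 7.2, 7.3]
augmented = jitter_series(measured, noise_sd=0.05, n_copies=3)
for variant in augmented:
    print([round(v, 2) for v in variant])
```

Jittering only adds variation around values that were already observed. It cannot invent events (such as a bloom or a system failure) that are missing from the original data.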

FYI, I summarized my current approach for some colleagues. Instead of “summarizing that summary” here, if you’re interested, you’ll find it in the “About…” menu under “ML & The WQ Map” at this link:

(NB: This test version lives on Heroku’s free tier, so it’s allocated limited server resources. That makes it slow to wake up, but it runs normally afterwards.)



You have done some great work here! Thanks so much for sharing. Let’s connect on LinkedIn and we can discuss further. Not sure where you are based (“peaceful-fjord” holds some clues ;)), but I have direct access to water management resources in South Florida, an area that is being gravely impacted by water quality issues.



Thanks for the compliment, Cam.

Let’s connect on LinkedIn…

I’m happy to discuss this further, but I’m not on social media. If you email me, we can move forward from there.

Not sure where you are based (“peaceful-fjord”…

I’m in W. PA.

(Heroku’s default naming is random [adjective]-[noun]-[number]. Thus “peaceful-fjord-26178…” After testing, I’ll “christen” the apps with descriptive names.)

Some of the water-quality problems that arise in aquaculture/aquaponics/hydroponics are homologous to those in the home aquarium, swimming pool/spa, & wastewater treatment sectors. I’ll be interested to learn if there are any applications for the software in your domain.