Measuring the performance of deployed TF Lite models

How can I measure the accuracy of deployed TF Lite models automatically, without a human evaluating their accuracy/performance, in order to decide whether they need to be retrained and redeployed?
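
To make the question concrete, here is a minimal sketch of the kind of automated check meant here. It assumes that some labeled samples can be collected from production (for example, from delayed ground truth), that the model is a float-input classifier with a single input and a single output, and that a fixed accuracy drop is an acceptable retraining trigger. The names `make_tflite_predictor`, `accuracy`, `needs_retraining`, and `max_drop` are hypothetical and chosen only for illustration; the `tf.lite.Interpreter` calls are the standard TensorFlow Lite Python API.

```python
from typing import Any, Callable, Iterable, Tuple


def make_tflite_predictor(model_path: str) -> Callable[[Any], int]:
    """Wrap a .tflite classifier as a function: sample -> predicted class index.

    Assumption: one input tensor, one output tensor, float input (no
    quantization scaling is applied here).
    """
    # Imported lazily so the evaluation helpers below work without TensorFlow.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]

    def predict(sample: Any) -> int:
        # Add a batch dimension and cast to the dtype the model expects.
        batch = np.expand_dims(np.asarray(sample), axis=0).astype(input_info["dtype"])
        interpreter.set_tensor(input_info["index"], batch)
        interpreter.invoke()
        scores = interpreter.get_tensor(output_info["index"])[0]
        return int(np.argmax(scores))

    return predict


def accuracy(predict: Callable[[Any], int],
             labeled_samples: Iterable[Tuple[Any, int]]) -> float:
    """Fraction of labeled samples for which predict(sample) equals the label."""
    total = 0
    correct = 0
    for sample, label in labeled_samples:
        total += 1
        if predict(sample) == label:
            correct += 1
    if total == 0:
        raise ValueError("no labeled samples to evaluate")
    return correct / total


def needs_retraining(current_accuracy: float,
                     baseline_accuracy: float,
                     max_drop: float = 0.05) -> bool:
    """Flag retraining when accuracy falls more than max_drop below the baseline."""
    return current_accuracy < baseline_accuracy - max_drop
```

In this sketch, a scheduled job would build the predictor once with `make_tflite_predictor("model.tflite")` (the path is a placeholder), call `accuracy` on the most recent batch of labeled samples, and start the retraining pipeline whenever `needs_retraining` returns `True`. The threshold rule is only one possible trigger; the open part of the question is what to do when labels for production data are not available.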