Senior Machine Learning Engineer at Terra AI

Doctor of Philosophy - University of Florida, USA (2018 -  2022)

Master of Science - University of Florida, USA (2016 - 2018) 

Bachelor of Technology - Vellore Institute of Technology, India (2010 - 2014) 

LinkedIn | Twitter | GitHub

Contact:     rish283@gmail.com

Senior ML engineer interested in roles related to machine learning, computer vision, uncertainty quantification and AI safety. My PhD research involved developing frameworks to predict the failure probability of AI-based systems (particularly deep learning models) in real time, with the goal of increasing user trust in such systems.

Hello! My name is Rishabh Singh. I am a senior machine learning engineer at Terra AI (https://terraai.com), a venture-backed startup focused on AI-based modeling and reasoning for mineral discovery and development. I work with geologists, geophysicists, and engineers to develop deep learning and uncertainty-aware geophysical modeling algorithms that help our clients in the mineral exploration and development domain optimize and accelerate their operations while greatly reducing the risks involved.

I previously worked as a Research and Development Scientist at UtopiaCompression Corporation, developing intelligent vision systems for object detection, classification and tracking using machine learning, computer vision and uncertainty quantification. I worked on the company's sense and avoid (SAA) platform: a lightweight vision-based package that lets unmanned aircraft systems navigate in the presence of cooperative and non-cooperative aircraft (for use in both military and civilian theaters) using detection, classification and tracking capabilities. My work focused on developing uncertainty quantification frameworks to predict the failure probability of the AI-based systems used in the SAA pipeline. In other words, my goal was to quantify how much the user can trust the results of already trained and deployed AI models.

I completed my PhD in machine learning and uncertainty quantification at the Computational NeuroEngineering Lab (CNEL), University of Florida, under the advisement of Dr. Jose C. Principe. I worked at the intersection of machine learning, kernel methods and information theory. My PhD research specifically involved developing physics-inspired frameworks for quantifying predictive uncertainty in learning models and data. This has a wide set of practical applications, including high-speed and precise quantification of uncertainty (likeliness of error) in a trained model's predictions, quantification of the transferability of data and models, adversarial attack detection, and time-series applications such as anomaly detection, clustering and dependence quantification. I focused most on uncertainty quantification of deep neural networks, with applications that include detecting model classification errors under test-set distributional shift and quantifying semantic segmentation uncertainty (for autonomous vision). As part of my initial work in the lab, I also explored novel methods of kernel adaptive filtering, as well as new linear dynamical systems for applications such as speech phoneme classification, dynamic texture synthesis and action sequence segmentation. Scroll down to learn more about my work.

I'm currently seeking a full-time research/engineering role in machine learning related to computer vision or time-series applications, with a preference for areas related to uncertainty in AI, AI safety, physics-inspired AI, AI interpretability, anomaly/attack detection and time-series analysis.

PhD RESEARCH OVERVIEW

Physics-inspired Functional Operator for Model Uncertainty Quantification in the RKHS: The QIPF

Proposed approach for neural network uncertainty quantification: (1) the RKHS projection of the model weights (ψw(.)) creates a potential field quantifying their PDF; (2) functional operators act on ψw(.) to create a multi-moment uncertainty landscape/function which, when evaluated at a prediction, quantifies its uncertainty.

Quantifying the prediction uncertainty of modern neural network models has become an important endeavor in machine learning research, especially in safety-critical applications like healthcare, defense systems and autonomous driving. Existing Bayesian methods, being highly iterative, are expensive to implement and often fail to accurately capture a model's true posterior because of their tendency to select only central moments. We propose a fast, single-shot uncertainty quantification framework where, instead of working with the conventional Bayesian definition of the model weight probability density function (PDF), we utilize functional operators (inspired by physics) over the projection of the model weights in a reproducing kernel Hilbert space (RKHS) to quantify their uncertainty at each model output. The RKHS projection of the model weights yields a potential-field interpretation of the model weight PDF, which in turn allows the definition of a functional operator, inspired by perturbation theory in physics, that performs a moment decomposition of the model weight PDF (the potential field) at a specific model output to quantify its uncertainty. We call this representation of the model weight PDF the quantum information potential field (QIPF) of the weights. The moments extracted by this approach automatically decompose the weight PDF in the local neighborhood of the specified model output and determine, with great sensitivity, the local heterogeneity of the weight PDF around a given prediction. These moments therefore provide sharper estimates of predictive uncertainty than the central stochastic moments of Bayesian methods. Experiments evaluating the error detection capability of different uncertainty quantification methods on covariate-shifted test data show our approach to be more precise and better calibrated than baseline methods, while being faster to compute.
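For intuition, here is a minimal, self-contained sketch of the flavor of this pipeline (not the released implementation): the weight PDF is represented by a Gaussian-kernel information potential over a set of flattened weight vectors, the field is normalized, and successive probabilists' Hermite polynomials of the normalized field value stand in for the extracted uncertainty moments. Representing the prediction as a point in the same space as the weights is a simplification made purely for illustration, and names such as `qipf_uncertainty` are hypothetical.

```python
import numpy as np

def information_potential(query, samples, sigma):
    """Gaussian-kernel information potential of `samples` evaluated at `query` (the RKHS field)."""
    d2 = np.sum((samples - query) ** 2, axis=1)
    return np.mean(np.exp(-d2 / (2.0 * sigma ** 2)))

def hermite_e(k, x):
    """Probabilists' Hermite polynomial He_k(x) via the three-term recurrence."""
    if k == 0:
        return 1.0
    h_prev, h = 1.0, x
    for n in range(1, k):
        h_prev, h = h, x * h - n * h_prev
    return h

def qipf_uncertainty(query, weight_samples, sigma=1.0, n_modes=6):
    """Toy QIPF-style score: evaluate the weight potential field at `query`, normalize it
    against the field over the weights themselves, and measure the spread of successive
    Hermite 'modes' of that normalized value (larger spread ~ more local heterogeneity)."""
    psi_q = information_potential(query, weight_samples, sigma)
    psi_w = np.array([information_potential(w, weight_samples, sigma) for w in weight_samples])
    z = (psi_q - psi_w.mean()) / (psi_w.std() + 1e-12)
    modes = np.array([hermite_e(k, z) for k in range(1, n_modes + 1)])
    return float(modes.std())

# Toy usage: a query far outside the weight cloud typically receives the larger score.
rng = np.random.default_rng(0)
weights = rng.normal(size=(200, 8))   # stand-in for flattened model weights
near = rng.normal(size=8)             # query consistent with the weight distribution
far = near + 6.0                      # query far outside it
print(qipf_uncertainty(near, weights), qipf_uncertainty(far, weights))
```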

Model uncertainty quantified by the QIPF in a regression problem (white regions: seen data regions, pink regions: unseen data regions). The quantified uncertainty can be observed to be very sensitive in discriminating between seen and unseen regions, while also quantifying sample uncertainty within the training region.

Model uncertainty quantified by the QIPF in a classification problem (noise-added two-moons dataset). The quantified uncertainty can be seen to be very precise, showing low uncertainty in regions where data samples are present and high uncertainty at the decision boundary and in regions outside the samples.

Error detection ROC curves of different uncertainty quantification methods implemented on a LeNet model trained on the MNIST dataset. The QIPF can be seen to be more accurate than the other methods in detecting model classification errors.


Quantifying Uncertainty of Predicted Semantic Scene Segments (Autonomous Vision Application)

The following is a summary of our latest ongoing work, in which we use the QIPF framework to quantify the uncertainty of road-scene segments predicted by a neural network (FCN-8) trained on the CamVid dataset. The video below shows that the segment-wise uncertainty quantified by the QIPF is highly correlated with the errors made by the trained model in segmenting the scenes. Preliminary results show the QIPF to be a much better uncertainty quantifier than baseline methods. More details and open-source code will be released soon!

*Results shown above are for a random subset of the test-set. Experiments ongoing.

Quantifying  Transfer  Learning  Uncertainty

Proposed approach: moments extracted from the local interaction of the fine-tuned layer weights with the RKHS potential field created by the source layer weights quantify the overall uncertainty in modeling the target dataset.

Transfer learning is an effective mechanism for adapting or fine-tuning pre-trained models on new datasets instead of training models from scratch. Here it becomes crucial to determine where the mechanism will work and where it won't, since the source and target datasets can have completely different intrinsic patterns; uncertainty quantification therefore becomes important in such cases. Our idea is to extend the QIPF framework to quantify the total uncertainty of the weights of the fine-tuned layer with respect to the weights of the pre-trained layers, in order to determine how well the tuned model will perform on the new dataset. Specifically, we evaluate the weights of the fine-tuned layers in the RKHS potential field created by the source layer weights, which gives the discrepancy between the source and target datasets. Work that directly implements the QIPF on datasets rather than models to determine transferability is currently ongoing.
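A hedged, toy illustration of this evaluation (omitting the mode decomposition for brevity): build the Gaussian-kernel potential field from the pre-trained source-layer weights and measure how weakly the fine-tuned target-layer weights sit inside that field. The weight shapes, kernel width, and the discrepancy score itself are illustrative stand-ins rather than the method's exact quantities.

```python
import numpy as np

def information_potential(query, samples, sigma):
    """Gaussian-kernel potential field of `samples` evaluated at `query`."""
    d2 = np.sum((samples - query) ** 2, axis=1)
    return np.mean(np.exp(-d2 / (2.0 * sigma ** 2)))

def transfer_discrepancy(source_weights, target_weights, sigma=2.0):
    """Toy proxy for transfer uncertainty: how weakly the fine-tuned (target) layer weights
    sit inside the RKHS potential field created by the pre-trained (source) layer weights."""
    field = np.array([information_potential(w, source_weights, sigma) for w in target_weights])
    return float(1.0 - field.mean())   # closer to 1 -> larger source/target discrepancy

# Toy usage: fine-tuned weights that drift far from the source field should score noticeably higher.
rng = np.random.default_rng(1)
src = rng.normal(size=(300, 4))                          # stand-in for flattened source-layer weights
tgt_close = src[:50] + 0.1 * rng.normal(size=(50, 4))    # fine-tuned weights near the source field
tgt_far = rng.normal(loc=4.0, size=(50, 4))              # fine-tuned weights that drifted far away
print(transfer_discrepancy(src, tgt_close), transfer_discrepancy(src, tgt_far))
```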

A VGG-16 network pre-trained on the ImageNet dataset was fine-tuned on the Kuzushiji-MNIST dataset. Uncertainty was quantified by decomposing the QIPF of the fine-tuned layer weights in the field created by the pre-trained layer weights. Left: depiction of the model architecture and the method for quantifying the performance of UQ techniques. Right: AUROC results of MC-dropout and the QIPF in test-set error detection. The QIPF is seen to outperform MC-dropout in detecting prediction errors.

We trained a 1-D Fully Convolutional Network (FCN) architecture on UCI time series classification datasets. Transfer learning was implemented using different UCI datasets as sources and the remaining datasets as targets. The QIPF was implemented on the network (for each source-target pair) to quantify the network's overall predictive uncertainty on the test sets of the target datasets (after pre-training and fine-tuning). The QIPF uncertainties for each source-target pair (right) can be seen to be highly correlated with the corresponding model test-set prediction errors (left), demonstrating the capability of the QIPF in detecting false model predictions.

Uncertainty-based  Anomaly  Detection

Proposed approach: we use the QIPF framework to extract dynamical (time-varying) uncertainty moments from a time series on a sample-by-sample basis. (1) The RKHS projection of the samples (ψx(.)) creates a potential field quantifying the signal PDF at a particular time. (2) Functional operators act on ψx(.) to create a multi-moment uncertainty landscape/function which, when evaluated on a new sample, quantifies its dynamical moments with respect to the signal PDF at that time.

A central problem in the analysis of real-world time-series datasets is that they are often non-stationary, i.e. characterized by time-varying statistical properties, which makes them very challenging for current machine learning and information-theoretic methods to process. Our conjecture is that a more effective way to characterize the non-stationary features of a time-series signal is to use a dynamic embedding space that automatically and sensitively adapts its local structure based on the evolution of the signal. To this end, we propose to utilize the QIPF framework, which provides a completely data-adaptive and multi-moment uncertainty representation of a signal and is consequently able to quantify the local dynamics at each point in the sample space in an unsupervised manner, with high sensitivity and specificity with respect to the overall signal PDF. Through the QIPF, we utilize concepts from quantum physics (which provides a principled quantification of particle-particle dynamics in a physical system) to interpret data. Consequently, we introduce a new energy-based information-theoretic formulation for pattern recognition tasks on time series data that quantifies the sample-by-sample dynamics of the signal (important in online time series analysis, and not achievable by conventional methods). We specifically explore applications such as anomaly detection and clustering.
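The sketch below illustrates the sample-by-sample idea under simplifying assumptions: the signal PDF "at a particular time" is approximated by a Gaussian-kernel field over a sliding window of recent samples, the dynamical moments of the incoming sample are taken as probabilists' Hermite polynomials of the normalized field value, and their spread is used as the anomaly/change score (mirroring the standard deviation of the 10 QIPF modes used in the figure below). The window length, kernel width, and function names are assumptions, not the exact recipe.

```python
import numpy as np

def kernel_field(x, window, sigma):
    """Gaussian-kernel potential of the recent `window` of samples, evaluated at the new sample x."""
    return np.mean(np.exp(-((window - x) ** 2) / (2.0 * sigma ** 2)))

def hermite_e(k, z):
    """Probabilists' Hermite polynomial He_k(z) via the three-term recurrence."""
    if k == 0:
        return 1.0
    h_prev, h = 1.0, z
    for n in range(1, k):
        h_prev, h = h, z * h - n * h_prev
    return h

def qipf_anomaly_scores(signal, window_len=100, sigma=0.5, n_modes=10):
    """Sample-by-sample score: spread of the QIPF-style modes of each new sample with
    respect to the field of the preceding `window_len` samples."""
    scores = np.zeros(len(signal))
    for t in range(window_len, len(signal)):
        window = signal[t - window_len:t]
        psi_t = kernel_field(signal[t], window, sigma)
        psi_w = np.array([kernel_field(x, window, sigma) for x in window])
        z = (psi_t - psi_w.mean()) / (psi_w.std() + 1e-12)
        scores[t] = np.std([hermite_e(k, z) for k in range(1, n_modes + 1)])
    return scores

# Toy usage: a mean shift halfway through the series produces a burst of high scores around it.
rng = np.random.default_rng(2)
sig = np.concatenate([rng.normal(0, 1, 500), rng.normal(3, 1, 500)])
scores = qipf_anomaly_scores(sig)
print(scores[120:480].mean(), scores[500:540].mean())  # scores should jump sharply after the change point
```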

Change point detection: 1000 samples of a drift dataset (top left), where black vertical lines mark the positions of the change points (where drift occurs); the standard deviation of the 10 extracted QIPF modes measured at each point (bottom left); and the corresponding ROC curves (right) of different methods in detecting the change points of the dataset. The QIPF can be seen to outperform the Bayesian method in detecting change points in the time series.

Time-series Dependency Quantification using QIPF and Optimal Transport

Accurate and domain-specific dependence measures between time series have long been an important research problem with many applications, notably in finance-related areas. Conventionally used correlation measures (Pearson correlation, for instance) have many disadvantages, such as an inability to effectively capture non-linear dependencies and a lack of robustness to noise and monotone transformations. Mutual information and copulas, on the other hand, only capture the strength of dependence and are inadequate for separating/classifying the types of dependence, which is critical in many application domains. For instance, when evaluating dependencies in stock market data, tail dependence is given much higher priority than other types, since one is more interested in knowing whether two stock market variables are correlated at their respective extreme values than at their means. Hence there is a need for measures that are both robust-equitable, i.e. able to efficiently quantify non-linear dependence while remaining robust to noise/transformations, and decomposable, i.e. able to cluster the different types of dependence specific to a problem domain.

To this end, we propose a decomposable, robust-equitable dependence measure called QIPF-Optimal Transport (QIPF-OT). The idea is to first use the QIPF uncertainty framework to quantify the cross information potentials ψY(X) and ψX(Y), representing the marginals of X and Y, and decompose them into modes [ψY0(X), ψY1(X), ψY2(X), ...] and [ψX0(Y), ψX1(Y), ψX2(Y), ...] that intrinsically induce a geometric clustering of the marginals in terms of their different degrees of heterogeneity (local gradient flow of their PDF). The next step is to use mode-constrained optimal transport, leading to a transportation coupling map between the two mode sets of X and Y. Dependence is then quantified by measuring how much the coupling map deviates from a one-to-one correspondence between the mode sets.
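Below is a hedged, self-contained toy of this pipeline: the cross information potentials are Gaussian-kernel fields of one variable's samples evaluated at the other's, the "mode sets" are summaries of successive Hermite polynomials of the normalized fields, the coupling is computed with a small entropy-regularized (Sinkhorn) optimal transport solver, and the returned statistic is the transport mass that falls off the diagonal, i.e. the deviation from a one-to-one mode correspondence. The kernel width, mode summary, and regularization are illustrative choices, not the method's exact construction.

```python
import numpy as np

def cross_potential(a, b, sigma=0.5):
    """Cross information potential: Gaussian-kernel field of samples `b` evaluated at each sample of `a`."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * sigma ** 2)).mean(axis=1)

def hermite_mode_set(field, n_modes=5):
    """Summarize successive Hermite 'modes' of the normalized potential field
    (a crude stand-in for the QIPF mode decomposition of the marginals)."""
    z = (field - field.mean()) / (field.std() + 1e-12)
    modes, h_prev, h = [], np.ones_like(z), z
    for k in range(1, n_modes + 1):
        modes.append(np.mean(np.abs(h)))
        h_prev, h = h, z * h - k * h_prev
    return np.array(modes)

def sinkhorn_plan(cost, reg=0.1, n_iter=300):
    """Entropy-regularized optimal transport plan between two uniform distributions over the mode sets."""
    n, m = cost.shape
    K = np.exp(-cost / reg)
    a, b = np.ones(n) / n, np.ones(m) / m
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def qipf_ot_dependence(x, y, n_modes=5):
    """Toy QIPF-OT-style statistic: transport mass off the diagonal of the coupling between
    the two mode sets, i.e. the deviation from a one-to-one mode correspondence."""
    mx = hermite_mode_set(cross_potential(x, y), n_modes)   # modes of psi_Y(X)
    my = hermite_mode_set(cross_potential(y, x), n_modes)   # modes of psi_X(Y)
    cost = np.abs(np.log1p(mx)[:, None] - np.log1p(my)[None, :])
    cost = cost / (cost.max() + 1e-12)                      # normalize for a stable Sinkhorn scaling
    plan = sinkhorn_plan(cost)
    return float(plan.sum() - np.trace(plan))

# Illustrative call on two toy series; how this statistic maps to a dependence score follows the description above.
rng = np.random.default_rng(4)
x, y = rng.normal(size=2000), rng.normal(size=2000)
print(qipf_ot_dependence(x, y))
```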

Figure: Depiction of Approach

Hierarchical  Linear  Dynamical  System  for  Modeling  Time-Series

Proposed approach: (1) and (2) represent the state and observation equations, respectively, of a regular linear dynamical system. In the HLDS, the states are split into multiple layers, zt, ut and xt, that are mutually coupled through the parameter matrix F. The dimensionality decreases from the bottom layer to the top, so that zt has the smallest dimension, providing maximum constraint in the nested state space.

The Hierarchical Linear Dynamical System (HLDS) is a modified Kalman-filter topology of the linear dynamical system (LDS) in which prior structural constraints are systematically imposed on the LDS by splitting the states into different layers and decreasing the dimensionality of the state variables from the bottom layer to the top. This induces unsupervised clustering of the signal dynamics (being modeled by the LDS) at its top state layer (i.e. the most constrained state variable), thereby improving its ability to model complicated time-series sequences through an induced nesting of dynamics. It was first introduced in this paper. I made the following contributions to the development and application of this algorithm (a toy sketch of the layered state space appears after the list).


1. Demonstrated the ability of the HLDS to model layered dynamic textures (a complex multi-dimensional, multi-dynamical data sequence where one or more texture videos are superimposed over another) in comparison with the LDS. Visit paper for details.


2. Introduced a Correntropy cost function for the HLDS to further improve its ability to localize the signal in its constrained states. This made it possible for the HLDS (despite being a linear algorithm) to cluster different speech phonemes. Visit paper for details.


3. Deployed HLDS for video game action sequence segmentation as part of a preliminary downstream task in a DARPA funded project: “An Active Architecture for Self Learning”. Link to Project Abstract.
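To make the topology concrete, here is a toy, self-contained sketch of the layered state space: a stacked state [z; u; x] whose dimensionality decreases toward the top, a block transition matrix F coupling the layers, and an observation equation that (here) reads only the bottom layer, run through the standard Kalman predict/update recursions. The layer sizes and the entries of F are placeholders chosen only for illustration; in the HLDS these quantities are learned, and the specific block structure of F is what enforces the nesting of dynamics.

```python
import numpy as np

# Layer sizes: the top layer z is the smallest (most constrained), the bottom layer x the largest.
DIMS = {"z": 2, "u": 6, "x": 12}   # hypothetical sizes for illustration
OBS_DIM = 20
n = sum(DIMS.values())

rng = np.random.default_rng(3)

# Block transition matrix coupling the layers (placeholder values, kept mildly contractive).
F = 0.05 * rng.normal(size=(n, n))
F[np.diag_indices(n)] += 0.7

# Observation matrix: only the bottom (least constrained) layer x_t generates observations here.
H = np.zeros((OBS_DIM, n))
H[:, -DIMS["x"]:] = rng.normal(size=(OBS_DIM, DIMS["x"]))

Q = 0.01 * np.eye(n)        # process noise covariance
R = 0.1 * np.eye(OBS_DIM)   # observation noise covariance

def kalman_step(m, P, y):
    """One predict/update step of the filter on the stacked state [z; u; x]."""
    m_pred, P_pred = F @ m, F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    m_new = m_pred + K @ (y - H @ m_pred)
    P_new = (np.eye(n) - K @ H) @ P_pred
    return m_new, P_new

# Toy usage: filter a random observation sequence and read off the most constrained layer z_t,
# which is where the HLDS clusters the signal dynamics.
m, P = np.zeros(n), np.eye(n)
for _ in range(50):
    m, P = kalman_step(m, P, rng.normal(size=OBS_DIM))
print(m[:DIMS["z"]])   # current estimate of the top (clustering) layer z_t
```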

Superior performance of the HLDS in synthesizing composite dynamic textures: (a) original DT frame sequence, (b) frames synthesized by the LDS, (c) frames synthesized by a 2-layer HLDS, (d) frames synthesized by a 3-layer HLDS. The HLDS models can be seen to synthesize visually better-quality frames.

Improvement in speech phoneme clustering from introducing Correntropy into the HLDS: comparison of the average Mahalanobis distance between cluster representations in the top state layer of the original (non-Correntropy) HLDS (left) and the Correntropy HLDS (right). The Correntropy HLDS can be seen to separate the different phoneme clusters from each other more effectively.

PUBLICATIONS


TALKS

Invited Talks:

YouTube Link

Contributed Talks:

Speaker: Jose C. Principe (Distinguished Professor, University of Florida)

Speaker: Shujian Yu (Associate Professor, UIT Arctic University of Norway)

Conference Presentations:

Talk Link

INDUSTRY  EXPERIENCE

Research & Development Scientist   -   UtopiaCompression Corporation  (January, 2023 - present)

Los Angeles, USA

UtopiaCompression Corporation is a company based in Los Angeles (USA) that develops state-of-the-art technology solutions for US government and defense agencies in the areas of computer vision, autonomous systems, unmanned aircraft Sense-and-Avoid, wireless communications, airborne networking, medical decision support systems and diagnostics.

I perform research and engineering tasks to develop the company's Sense-and-Avoid (SAA) platform: a lightweight vision-based package that allows unmanned aircraft systems to navigate in the presence of cooperative and non-cooperative aircraft (for use in both military and civilian theaters) using detection, classification and tracking capabilities. My work focuses on developing methods to quantify uncertainty in the results of the deep learning (DL) models used in the SAA pipeline, informing the user how much they can trust the results of the deployed AI systems. My general responsibilities also include improving the company's current algorithms; writing, evaluating and testing code; and aiding technology transition into US government and commercial markets.

Company Website

Research Scientist Intern   -   Aventusoft LLC   (May, 2020 - August, 2020)

Boca Raton, Florida, USA

AventuSoft LLC is a research startup that develops medical devices for high-value cardiac assessments by analyzing heart valve movements.

During the internship, I worked with the HEMOTAG device (link), Aventusoft's flagship product for diagnosing and managing heart failure. I developed deep learning algorithms for detecting fiducial points in electrocardiography (ECG) signals as part of a downstream task for detecting arrhythmias, and tested and validated them on benchmark public ECG datasets such as MIT-DB, European ST-T and PhysioNet. My work was incorporated into the product.

I also collaborated with the research team to learn more about the HEMOTAG platform and discussed and suggested future research work to improve the technology further. The internship also gave me a valuable opportunity to learn about the challenges involved in the initial phases of a product launch.

Company Information

Hemotag Product: https://www.hemotag.com/

Assistant Manager   -   Tata  Motors  Limited   (May, 2014 - May, 2016)

Pune, Maharashtra, India

I worked in the Commercial Vehicle Business Unit (CVBU) of Tata Motors at its Pune plant. My job involved studying vehicle assembly line automation systems, suggesting improvements, and carrying out major maintenance operations when necessary. I made key technical improvements to several production automation systems with respect to safety, maintenance and productivity.

My performance in the company was rated in the top 10% during my first year.

Company website

Summer Intern   -   NTPC  Limited   (May, 2012 - June, 2012)

Badarpur, Delhi, India

I was a summer intern at NTPC (National Thermal Power Corporation), India's largest power utility company, at its Delhi (Badarpur) plant where I learned about the various aspects of the power generation and distribution processes and the functioning and roles of various departments (Electrical maintenance, control and instrumentation).

Company website

ACADEMIC  POSITIONS

Research Assistant   -   University of Florida   (August, 2017 - Present)

Gainesville, Florida, USA

I work as a research assistant at the Computational NeuroEngineering Lab (University of Florida) where I develop novel uncertainty quantification methods based on information theory, kernel methods and concepts in quantum physics for various applications in signal processing and machine learning.

Grants: DARPA - FA9453-18-1-0039, ONR - N00014-21-1-2345


Teaching Assistant   -   University of Florida   (January, 2022 - Present)

Gainesville, Florida, USA

I am also working as a teaching assistant for Dr. Jose C. Principe this semester for the course Machine Learning for Time Series. The course covers topics such as the theory of adaptation with stationary signals, performance measures, the LMS and RLS algorithms, implementation issues, and applications. My role includes clarifying concepts and answering questions posed by students, grading assignments, helping develop the curriculum, and assisting in delivering lectures.

UNDERGRADUATE  EXPERIENCE

TEAM OJAS  -   Formula Student Electric Car Project (2012 - 2013)

I was part of the 40-member Formula Student (FS) team of my college, Team Ojas, which built a single-seat electric car for the international Formula Student competition held at the Silverstone circuit, UK. Along with two other members, I was responsible for the high-voltage electrical systems of the vehicle.

UNDERGRADUATE  RESEARCH  PROJECTS


During my junior and senior years, I worked on two research projects involving electric machine systems (my area of interest at the time):

1. Dynamic braking of induction motor - Analysis of conventional methods and an efficient multistage braking model:

A detailed analysis of the capacitor self-excitation and DC-injection methods of dynamic braking of an induction motor was carried out through MATLAB/SIMULINK simulations. These conventional dynamic braking methods were carefully analysed while varying various parameters. Using the data obtained from the analysis, the parameters of a fast and reliable multi-stage dynamic braking model were determined for the best possible braking performance. (paper link)


2. Novel Rotor Position Estimation For Switched Reluctance Motor Using Vibration Signature Analysis:

In this project, simulations were performed to demonstrate a novel rotor position estimation technique for switched reluctance motors in which the vibration pulses sensed on the stator frame were used as a signature to determine the rotor position. This removes the need for the conventionally used shaft position sensor inside the stator, which adds cost and mechanical alignment problems. (paper link)

REVIEWER  SERVICE

I have served as a reviewer for the following venues: