SYNTHETIC DATA: RESEARCH SAVIOUR IN DATA-STARVED AREAS

DailyPost 2638

The fate of research is all too well known: it is a search for original findings that can add value to our existence, or at least improve it incrementally. The harsh reality is that most researchers are unaware of the immense changes happening in the research ecosystem the world over, or of how best the newer digital assets can be used for research. In quite a few areas and topics there might not yet be use cases, but use cases can come about only if someone takes the bull by the horns and proves the worth of a new approach based on a novel knowledge asset.

The guide, the researcher and the academic management are all in their own world in this country. They are not ready to experiment with anything, yet are still supposed to deliver world-class research products and propel the country towards becoming a research engine. Lip service cannot deliver anything; it never has. Today all data-based research hits bottlenecks in no time: data is not available, or cannot be provided for industry or proprietary reasons; the government has still not gone ahead with an open data policy; and now privacy issues plague it as well. Resources are spread too thin for any worthwhile contribution. For most of them it is a hit job, performed to propel their careers.

The integrity of data is another issue. Anonymization and pseudonymization of data in an uncontrolled environment raise many problems. Data as an input for research has not been given its due; it is presumed that a solo researcher, or a few at best, can do justice to this main input. The questionnaire methodology has been coming apart at the seams for a long time, for a variety of reasons. Collation and processing are done in a near-manual mode even today. Synthetic data, that is, data artificially generated through the right process under the most stringent guidelines and validation, through an iterative methodology, with eminent domain experts leading from the front, can be a gamechanger.

If synthetic data can do wonders in training AI models, it can certainly become research’s poster boy. As it stands, this can happen only in conjunction with real data. Data mapping is a must today to take a call on whether a doctoral or post-doctoral research proposition would lead to a few original findings or not. The validation of real data is no mean task, and the work has to start from there. Otherwise, flawed real data is likely to vitiate the subsequent process of creating synthetic data. Domain experts would validate the real utility of synthetic data for a specific research purpose. One size fits all can only lead to failure. Tech does not have a voice to defend itself. Synthetic data creation and usage as an input is a complex interdisciplinary task and can happen only at the level of a specific sector or even lower, depending on the complexity of that sector. Only then does it become an objective and empirical research input.

SYNTHETIC DATA LABS CAN PROPEL RESEARCH TO A DIFFERENT TRAJECTORY.
Sanjay Sahay

Have a nice evening.
