Cracking the Code: Distinguishing Authentic AI from Imitations
AI Growth
Lately, every tech company is advertising the inclusion of Artificial Intelligence (AI) in their solutions and service offerings. With the market estimated to approach USD 1 trillion by 2029 (some reports easily double this figure), it’s no surprise that everyone is embracing this technology. While this number may seem enticing, it’s important to be cautious when you see a solution boasting about its AI capabilities. Is it the real deal? Or is it a brute-force approach that pretends to be intelligent? Many solutions are simply automation, like RPA (Robotic Process Automation), task-based tools, or other variations that cannot be considered true AI, or at best only an incomplete form of it.
True AI-based solutions should at least be measured based on the following two categories:
- Self-Learning
- Models & Algorithms
Self-Learning
One benchmark of a true AI solution is whether it can learn on its own. This is at the heart of the definition of AI: systems that can be left alone to perform tasks requiring human ‘intelligence’. If a system cannot learn over time without manual intervention, that is a red flag. Let’s take a quick step back, though, because demanding zero intervention is a bit extreme. Fine-tuning a machine learning model is distinct from adding yet another conditional clause to a million-line IF/ELSE-IF codebase. The latter is, at best, a rudimentary form of AI demanding ongoing human input, akin to the early Expert Systems. Self-learning is essential because it encompasses unsupervised and reinforcement learning, enabling the model to acquire new knowledge from fresh data without the manual coding seen in the prior example.
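The contrast above can be sketched in a few lines of code. This is a purely illustrative toy (the spam-filter framing, data, and perceptron are our own assumptions, not any vendor’s product): the rule-based function needs a human to add every new condition, while the tiny classifier adjusts its own weights whenever new labeled data arrives.

```python
def rule_based_spam_check(message: str) -> bool:
    # "Fake AI": every new spam pattern requires a hand-written condition.
    if "free money" in message:
        return True
    elif "win a prize" in message:
        return True
    return False

class SelfLearningClassifier:
    """A minimal perceptron that updates its own weights from labeled examples."""

    def __init__(self):
        self.weights = {}  # word -> weight, learned rather than hand-coded
        self.bias = 0.0

    def predict(self, message: str) -> bool:
        score = self.bias + sum(self.weights.get(w, 0.0)
                                for w in message.lower().split())
        return score > 0

    def learn(self, message: str, is_spam: bool) -> None:
        # Nudge weights only when the prediction disagrees with the label.
        if self.predict(message) != is_spam:
            error = 1 if is_spam else -1
            for w in message.lower().split():
                self.weights[w] = self.weights.get(w, 0.0) + 0.5 * error
            self.bias += 0.5 * error

clf = SelfLearningClassifier()
# Fresh data arrives; nobody edits the code — the model updates itself.
for text, label in [("free money now", True), ("meeting at noon", False),
                    ("claim your crypto reward", True), ("lunch tomorrow", False)]:
    for _ in range(5):  # a few passes over this tiny sample
        clf.learn(text, label)
```

The point is not the perceptron itself but the shape of the two approaches: one accumulates brittle conditions, the other accumulates knowledge from data.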
Models & Algorithms
The second aspect of true AI is the model, the training data, and the algorithms leveraged. A model’s accuracy is only as good as the training data supplied to it: generally, the more quality data, the better the trained model.
Consider the nature of the data processing itself. The sheer volume of data processed doesn’t by itself imply AI. For instance, is the solution merely loading and indexing large datasets for keyword search, or is it actually tokenizing and vectorizing documents and using that data for fine-tuning or embeddings?
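The distinction can be made concrete with a toy sketch (pure Python, invented documents): a keyword index only answers exact-match lookups, whereas vectorizing documents lets a system rank them by similarity — the first step on the road to embeddings.

```python
import math
from collections import Counter

docs = ["the patient reported chest pain",
        "invoice for cloud hosting services",
        "severe chest discomfort after exercise"]

# 1) Mere indexing: word -> set of doc ids; exact keyword hits only.
index = {}
for i, d in enumerate(docs):
    for word in d.split():
        index.setdefault(word, set()).add(i)

# 2) Vectorization: each doc becomes a term-frequency vector that can be
#    compared with cosine similarity.
def vectorize(text: str) -> Counter:
    return Counter(text.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

query = vectorize("chest pain")
scores = [cosine(query, vectorize(d)) for d in docs]
best = scores.index(max(scores))  # the most similar document, not just a hit list
```

An index can tell you *where* a word occurs; a vector representation can tell you *how close* two documents are, which is what downstream AI processing builds on.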
Not all companies can afford to invest millions in developing their AI solutions, like custom Large Language Models, especially at the scale of OpenAI, Google, Microsoft, and other major players. Does this imply that your solution doesn't fall under the category of true AI? In the context of Large Language Models (LLMs), does your approach involve enhancing existing models by incorporating your own knowledge through embeddings and vector databases? From our perspective, this still qualifies as AI, as it efficiently leverages existing LLMs while actively contributing to augmenting the knowledge index with your own data through AI-related processes and tools.
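The embedding-plus-vector-database pattern described above is commonly called retrieval-augmented generation (RAG). Below is a minimal sketch of its shape; the character-trigram embedding, the documents, and the prompt format are all our own toy assumptions standing in for a real embedding model and vector database, and the final call to an LLM is deliberately omitted.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: character trigram counts.
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# "Vector database": the company's own knowledge, embedded once and stored.
knowledge = ["Refunds are processed within 14 days.",
             "Support is available Monday through Friday.",
             "Premium plans include priority onboarding."]
store = [(embed(k), k) for k in knowledge]

def retrieve(question: str, k: int = 1) -> list:
    q = embed(question)
    ranked = sorted(store, key=lambda pair: -similarity(q, pair[0]))
    return [text for _, text in ranked[:k]]

def build_prompt(question: str) -> str:
    # Retrieved context is prepended before handing the prompt to an LLM.
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How long do refunds take?")
```

Even in this stripped-down form, the company is doing genuine AI-adjacent work — embedding its own data and retrieving by similarity — rather than passing raw user input straight to someone else’s API.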
On the contrary, some solutions merely overlay their interfaces onto the OpenAI API without engaging in AI-related activities themselves. For example, there are many models that provide medical assessments based on the input of symptoms. A fake AI solution might borrow such a pre-trained model and provide stunning visualizations and interpretations of the data. However, when new patient data arrives, can they retrain the existing model themselves, or must they wait for a newly trained model from another provider? This is not to say the company isn’t providing any value. In fact, it might excel at presenting information, but the solution merely leverages AI built by someone else.
Equally significant are the algorithms and data processing pipelines employed. If a company doesn’t train its own model, it typically won’t be able to explain how the model was built. To vet a solution’s AI, consider the problem being solved, because different classes of AI problems call for different algorithms. Whether it’s linear regression for numeric prediction or Naive Bayes for NLP, certain categories of algorithms map to certain types of problems. We are of course not asking a company to divulge its intellectual property, but if it can’t share even high-level aspects of its solution or the types of algorithms used, a further assessment might be required.
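As an example of matching an algorithm to a problem class, here is a minimal Naive Bayes text classifier with add-one smoothing — the kind of high-level answer ("Naive Bayes for text classification") a vendor should be able to give. The training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict

train = [("great product love it", "pos"),
         ("terrible waste of money", "neg"),
         ("love the quality great value", "pos"),
         ("awful terrible experience", "neg")]

word_counts = defaultdict(Counter)  # label -> per-word counts
label_counts = Counter()
vocab = set()
for text, label in train:
    label_counts[label] += 1
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def classify(text: str) -> str:
    scores = {}
    for label in label_counts:
        total = sum(word_counts[label].values())
        # Log prior plus sum of add-one-smoothed log likelihoods.
        score = math.log(label_counts[label] / len(train))
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)
```

A regression problem, a clustering problem, and an NLP problem each pull from a different algorithmic toolbox; a vendor who can name its toolbox at this level of detail gives up no intellectual property.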
Retrospect
Although certain concepts above are labeled as non-AI, some served as precursors to AI’s evolution. Additionally, elements like RPA might be employed within larger solutions to connect AI decisions to real-world processes and tasks. The key is to evaluate whether a solution or company’s claims about its AI strategy hold merit, or whether it’s merely a case of “AI washing,” as some have termed it.
References:
- https://www.bloomberg.com/press-releases/2023-06-02/with-28-5-cagr-mobile-artificial-intelligence-ai-market-size-is-expected-to-reach-usd-880-28-bn-by-2029
- https://www.techopedia.com/ai-washing-everything-you-need-to-know/2/34841