‘We think about Foundational AI models as way beyond large language models’

As vice-president of IBM Research AI, Sriram Raghavan heads the American company’s artificial intelligence (AI) research labs. Until recently, he was director for IBM Research Lab in India and research centre in Singapore. In an interview, he shares IBM’s AI strategy, his thoughts on how to seek return on investment (ROI) from AI, the impact of advances in quantum computing on AI, and how AI needs an ethical framework. Edited excerpts:  

Why are CXOs of many companies across the globe and in India too, who have adopted AI, still struggling with the ROI?

The use cases are well understood. So, if you can get the AI model working and created with the right investment, the business impact is clear. But am I going to take six months? Am I going to continue to require 300 people to maintain the model? These are the ROI questions they (CXOs) are struggling with. This is the reason we are very excited about Foundation Models (large models like Generative Pre-trained Transformer 3, or GPT-3), but we think about them way beyond just large language models. At the core of the foundation models is the following idea: Can I train a model to create a representation with zero human supervision, that is, self-supervised? If yes, I’m only limited by the compute power and infrastructure to process all that data.

Imagine I have to do 20 NLP (natural language processing) tasks, including question answering, sentiment analysis, and extraction. The traditional approach was to start from raw data for each task, collecting and processing all of it to build that task's model.

With Foundation Models, you are not limited by labelled data (such as “cat” or “dog”) because your model can be trained without it. I also need not start with raw data every time, so 20 AI models can be created with the same data set. I, thus, pay the cost of data curation engineering once as opposed to 20 times (hence better ROI). The challenge, though, is you must have the skills and the compute power to train these large AI models.
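
The cost argument above can be made concrete with a rough sketch. Everything in it is an assumption for illustration only: the function names, the split into curation, pretraining and per-task adaptation costs, and all the numbers are hypothetical and do not come from IBM or the interview. The only figure taken from the text is the count of 20 tasks.

```python
# Hypothetical illustration of "pay for data curation once, not 20 times".
# All cost figures are made-up units, not real measurements.

def traditional_cost(num_tasks: int, curation_cost: float, training_cost: float) -> float:
    """Each task curates its own data and trains its own model from raw data."""
    return num_tasks * (curation_cost + training_cost)


def foundation_cost(num_tasks: int, curation_cost: float,
                    pretraining_cost: float, adaptation_cost: float) -> float:
    """Data is curated once and one large model is pretrained once;
    each task then pays only a (smaller) adaptation cost."""
    return curation_cost + pretraining_cost + num_tasks * adaptation_cost


if __name__ == "__main__":
    tasks = 20  # the number of NLP tasks mentioned in the interview
    print("traditional:", traditional_cost(tasks, curation_cost=10.0, training_cost=5.0))
    print("foundation: ", foundation_cost(tasks, curation_cost=10.0,
                                          pretraining_cost=50.0, adaptation_cost=1.0))
```

With these invented numbers the one-off pretraining cost is large, which mirrors the challenge noted above about skills and compute power, but it is shared across all 20 tasks rather than repeated for each one.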

IBM has been talking about NLP, AI automation, Advanced AI, Scaling AI, and Trust AI as part of its overall AI approach. What do these terms mean for businesses?

The focus on NLP and Trust at its core is a recognition that there is a science around creating trustworthy AI. Then there is the operationalization of trust. In an enterprise context, this can involve NLP to build conversational systems. NLP also allows us to extract insights that help us do IT automation. There is also the automation of AI —the application of AI to do business and IT automation.

A lot of people have to build AI models. How do we empower them to make it easier and faster to build the right ones?

This is sometimes referred to as Scaling AI. Underpinning all of these is the fact that we continue to think of our AI and hybrid cloud strategy as tightly coupled because we’re always building AI to run where the data is.

Given the massive strides that AI has made over the last few years, do you think that we have reached the stage where a breakthrough in AI becoming sentient can happen anytime?

Unequivocally, AI is not sentient. We have continued to advance our ability to do pattern recognition and smart representation at scale. With these more recent advances, we are not only doing prediction and classification but generating too, yet we are still data driven. Data representations today have become more powerful since they are learned from data. But we are far away from anything in AI that’s going to be called sentient.

Give us some examples of how AI is being automated?

The use case that crosses industries and geographies, and which people find the easiest to get started with, is conversational AI for customer interaction. The second one is the application of AI to IT automation. The third is process automation or workflow automation or business automation.

We are also seeing a shift from task automation to task orchestration. Can AI go beyond task automation into tasks — how to do credit check; how to do citizenship check? Can it put together the flow knowing this is what you want to accomplish? That’s the vision behind orchestrator, and it will expand the scope of automation.

The work around network automation (is gaining traction) as telcos increasingly adopt new networks, 5G, etc., because of which they need more and more AI technology. The India research lab, as an example, was instrumental in some of the work we did globally with this network and there are 5G operations where they wanted to use AI to help figure out automated allocation of resources for 5G slicing (dividing the network into multiple virtual connections which can be tailored to the traffic requirements of different uses), etc.

I also see a huge opportunity for more and more AI to show up in sustainability, which is why IBM Research puts so much in our work with our business units to release the environmental intelligence suite.

Will advances in quantum computing dramatically speed up computation of AI models and result in breakthroughs that are inconceivable today?

I wouldn’t claim we have the answer today. But quantum has the promise to create representations of data that you cannot create classically (with traditional computers). It will detect structure in data that you cannot detect through classical mechanisms. Once we can exploit that, the marriage between quantum and AI we think will happen. Then you will use classical techniques to scale them up and run them. So, I view it more not as (increasing the) speed, but letting you do things you could not do at all.

Academic institutions like IIT Delhi and IIT Bombay have partnered with IBM AI Horizons Network. What do you hope to achieve from these partnerships, and have you seen any tangible outcomes or are these early days?

There are three things with each of these partnerships. One is the talent pipeline, and we are working closely with the faculty and students. Many of them are the people we end up hiring back into the lab and working with. The second is in areas in which Foundational AI is at work in the public domain. As an example, our team has been working with IIT Bombay on what they call table retrieval (information in tables). AI can give you a straightforward answer. We also provide a network (to universities). We bring them together. There are prospecting collaborations, and it doesn’t mean that academics can’t collaborate with each other. But we sometimes become the trigger for cross-collaboration, which also enriches the work. So, there is now joint work across IIT Bombay, IIT Delhi and IBM Research.
 
Talking about AI, as we see today, cannot be had without talking about ethics too. What is the IBM approach here?

There is the ethics of AI which goes beyond technology. That’s the intersection of technology with society, policy, etc. The role that IBM plays there is as a technologist offering our point of view and engaging with policy bodies, but the rules are created by lawmakers. Trustworthy AI is the second bucket, which is where we are creating the technology and techniques. That is where IBM Research has been heavily invested. Then there is what I call the operationalisation and governance of AI as the third bucket. This is where AI meets AI engineering when you deploy it.

Right now, getting explainable, reliable AI models deployed still requires a fair amount of skill and expertise. We don’t have enough people with the skill and expertise relative to how much AI people want to deploy. So, a lot of our focus is in making it easier, faster, and to automate a lot of these things.

The challenge is that our clients have hundreds and thousands of assets. We have to help them develop an AI model to do its job for those thousands of assets that also change over the years; that is really when the trustworthy AI rubber meets the road. Our approach in IBM has always been to think of trust as a platform with toolkits for privacy, toolkits for bias detection, toolkits for fairness, and for explainability. They are plug and play into the platform and, depending on the use case that you are applying, you can use the platform, choose the right tools from those toolkits and operationalise it.

In use cases and products where we deploy AI models, we can bundle in lifecycle management and integrate it ourselves. In scenarios where the customers create their own models, we are working to give them platforms like Studio and OpenScale where they can help manage the lifecycle of models with more and more effort, so that is the journey that we are on.

