What LinkedIn learned from using large language models to serve its billion users
With more than 1 billion users worldwide, LinkedIn continues to push the limits of today's enterprise technology. Few companies operate at LinkedIn's scale or have similarly vast data resources.
This business and employment-focused social media platform connects qualified candidates with potential employers, and helping fill job vacancies is its core business. It is also important to ensure that posts on the platform reflect the needs of employers and consumers. Under LinkedIn's model, these matching processes have always relied on technology.
By the summer of 2023, when GenAI was first gaining steam, LinkedIn began to consider whether to leverage large language models (LLMs) to match candidates with employers and make the flow of information more useful.
So the social media giant embarked on a GenAI journey and is now reporting what it learned from using Microsoft's Azure OpenAI service. CIOs across all industries can take some lessons from LinkedIn's experience. As most CIOs know, adopting emerging technologies comes with trials and setbacks. LinkedIn was no different, and according to Juan Bottaro, the company's principal software engineer and head of technology, its road to working with LLMs has been anything but smooth.
The initial wave of hype surrounding GenAI didn't help.
"LLMs are something new, and it feels like they can solve every problem," Bottaro said. "We didn't start out with a very clear idea of what an LLM could do."
For example, early versions of the improved job matching effort were, to put it loosely, rude. Or at least too literal.
"It's not practical to click 'Evaluate my suitability for this job' and get 'You're not a good fit at all,'" Bottaro said. "We want [responses] to be factually accurate but also empathetic. Some members may be considering a career change for which they are not currently well suited and need help understanding the gaps and what to do next."
So an important early lesson for LinkedIn was to tune its LLMs to meet audience expectations, and to help them learn to respond in a way that may not be human, but is at least humane.
Speed Matters
Although LinkedIn has more than a billion members, much of the job search functionality that relies on LinkedIn's LLM work was initially targeted at Premium members, a relatively small group. (LinkedIn declined to say how many Premium members it has.)
"I wouldn't say LLMs are fast. I don't think speed is an advantage," Bottaro said.
Speed can be defined in many ways. While operationally LLMs may not be as fast as hoped, Bottaro said the acceleration of the overall deployment process is astounding. "The superpower of this new technology is that you can create prototypes very quickly, somewhere between two and three months. Before this technology, that was not possible," he said.
When asked how long various aspects of the project would take without an LLM, Bottaro said some might not be completed at all, while other elements "could take several years."
As an example, Bottaro mentioned the part of the system that aims to understand intent. Without an LLM, this would have taken two to three months, but the LLM mastered it in "less than a week."
Cost Considerations
One aspect Bottaro calls a "barrier" is cost. Like speed, cost means different things at different stages of a project, as LinkedIn's experience shows.
"Even if it's just for a few million members," Bottaro said, possibly hinting at the number of Premium members, costs soared. That's because LLM pricing, at least under LinkedIn's licensing agreement with Microsoft (its LLM provider and parent company), is based on usage, specifically the number of input and output tokens.
Tarun Thummala, CEO of an AI vendor, explained in a LinkedIn post unrelated to the project that each LLM input or output token is roughly equivalent to 0.75 of a word. LLM providers typically sell tokens by the thousand or the million. For example, the Azure OpenAI service used by LinkedIn charges $30 per 1 million 8K GPT-4 input tokens and $60 per 1 million 8K GPT-4 output tokens in the US East region.
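To make the pricing arithmetic concrete, here is a rough sketch that combines the 0.75-words-per-token rule of thumb with the per-million-token prices quoted above. The workload numbers (prompt length, answer length, member count, requests per member) are purely hypothetical and are not LinkedIn figures.

```python
# Prices quoted in the article for 8K-context GPT-4 on Azure OpenAI (US East).
INPUT_USD_PER_MILLION = 30.0
OUTPUT_USD_PER_MILLION = 60.0
WORDS_PER_TOKEN = 0.75  # rough rule of thumb cited in the article


def words_to_tokens(words: int) -> int:
    """Approximate the token count for a given word count."""
    return round(words / WORDS_PER_TOKEN)


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request under usage-based input/output token pricing."""
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000


# Hypothetical workload, chosen only for illustration.
prompt_tokens = words_to_tokens(1500)  # about 2,000 tokens of prompt and context
answer_tokens = words_to_tokens(300)   # about 400 tokens of generated answer
per_request = request_cost_usd(prompt_tokens, answer_tokens)

members = 1_000_000        # assumed number of members using the feature
requests_per_member = 10   # assumed requests per member per month
monthly = per_request * members * requests_per_member

print(f"per request: ${per_request:.3f}")
print(f"per month:   ${monthly:,.0f}")
```

Under these assumed numbers a single request costs less than a dime, yet the monthly bill reaches the high six figures, which illustrates how usage-based pricing can climb quickly even for a few million members.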
Another goal LinkedIn set for its project is automated evaluation. Evaluating LLMs for accuracy, relevance, safety, and other concerns has always been a challenge. Leading organizations and LLM makers have been trying to automate some of this work, but according to LinkedIn, the capability is "still a work in progress."
Without automated evaluation, LinkedIn reports that "engineers can only rely on visual inspection of results and testing on a limited sample set, often with a delay of more than 1 day before the metrics are known."
The company is building a model-based evaluator to help estimate key LLM metrics such as overall quality score, hallucination rate, coherence, and responsible AI violations. Doing so will speed up experiments. LinkedIn's engineers said they have had some success with hallucination detection, but the work in this area is not finished.
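As a minimal sketch of how per-response judgments might roll up into the metrics named above, the following assumes that some evaluator (not shown here) has already produced one judgment per response. The `Judgment` fields and the `summarize` helper are hypothetical names for illustration and do not reflect LinkedIn's actual evaluator.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Judgment:
    """One evaluator verdict for one LLM response (hypothetical schema)."""
    quality: float        # 0.0 (poor) to 1.0 (excellent)
    hallucinated: bool    # response contained unsupported claims
    coherent: bool        # response was logically consistent
    rai_violation: bool   # response broke a responsible-AI rule


def summarize(judgments: Sequence[Judgment]) -> dict[str, float]:
    """Aggregate per-response judgments into experiment-level metrics."""
    n = len(judgments)
    if n == 0:
        raise ValueError("no judgments to summarize")
    return {
        "quality_score": sum(j.quality for j in judgments) / n,
        "hallucination_rate": sum(j.hallucinated for j in judgments) / n,
        "coherence_rate": sum(j.coherent for j in judgments) / n,
        "rai_violation_rate": sum(j.rai_violation for j in judgments) / n,
    }


# Made-up sample of four judgments, for illustration only.
sample = [
    Judgment(quality=0.9, hallucinated=False, coherent=True, rai_violation=False),
    Judgment(quality=0.4, hallucinated=True, coherent=True, rai_violation=False),
    Judgment(quality=0.8, hallucinated=False, coherent=True, rai_violation=False),
    Judgment(quality=0.7, hallucinated=False, coherent=False, rai_violation=True),
]
metrics = summarize(sample)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Computing the metrics this way for every experiment run would replace the manual inspection of a limited sample, provided the evaluator's judgments themselves can be trusted.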
Some of the challenges LinkedIn has encountered with its job matching effort come down to data quality issues on both sides: employers and potential employees.
An LLM can only work with the data provided to it, and sometimes job postings do not accurately or comprehensively describe the skills employers are seeking. On the other hand, some job seekers post poor resumes that do not effectively reflect their extensive experience in problem solving and other areas.
In this regard, Bottaro sees the potential for LLMs to help both employers and potential employees. Better writing benefits employers and LinkedIn users alike, because the company's job matching LLM works more effectively when the input data is of higher quality.
When dealing with such a large membership base, accuracy and relevance metrics can "give a false sense of comfort," Bottaro said. For example, if an LLM "gets it right 90 percent of the time, that means 1 in 10 people will have a bad experience," he said.
What makes this deployment even more difficult is the extreme nuance and judgment involved in providing useful, helpful, and accurate answers.
"How do you define what is good and what is bad? We spent a lot of time working with linguists to develop guidance on how to provide comprehensive representation. We also did a lot of user research," Bottaro explained. "How do you train people to write the right response? How do you define the task and dictate what the response should look like? The product might try to be constructive or helpful. It doesn't try to assume too much, because that's where hallucinations start. We take great pride in the consistency of our responses."
LinkedIn's sheer scale creates another challenge for job matching. With a billion members, a job ad may receive hundreds or even thousands of responses within minutes of being posted. Many job seekers may not bother applying if they see that hundreds of people have already applied. This requires the LLM to find matching members very quickly and reach them before less qualified applicants submit their materials. After that, whether members see the notification and respond in a timely manner remains an open question.
On the employer's side, the challenge is finding the most suitable candidates, not necessarily the ones who are quickest to respond. Some companies are reluctant to publish salary ranges, which further complicates efforts on both sides because the most qualified candidates may not be interested, depending on how much the position pays. This is a problem that an LLM cannot solve.
LinkedIn’s vast database contains a lot of unique information about individuals, employers, skills, and courses, but its LLMs haven’t been trained on this data. Therefore, according to LinkedIn engineers, they are currently unable to use these assets for any inferencing or response-generating activities due to how these assets are stored and served.
Here, retrieval-augmented generation (RAG) is a typical solution. By building pipelines to internal APIs, enterprises can "enhance" LLM prompts with additional context to better guide and constrain the LLM's responses. Most of LinkedIn's data is exposed through RPC APIs, which company engineers say are "convenient for humans to call programmatically" but "not LLM friendly."
To solve this problem, LinkedIn engineers "wrapped skills" around its APIs, giving each an "LLM-friendly description of what the API does and when to use it," along with configuration details, input and output schemas, and all the logic needed to map the LLM-friendly version of each API to its underlying (actual) RPC version.
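The idea of a skill wrapper can be sketched roughly as follows. This is an illustration under stated assumptions, not LinkedIn's implementation: the `Skill` class, the `search_jobs` skill, and the `fake_job_search_rpc` stand-in are all hypothetical names, and a real wrapper would call an actual RPC client and carry a richer schema.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Skill:
    """Hypothetical LLM-friendly wrapper around one RPC endpoint."""
    name: str
    description: str               # what the API does and when to use it
    input_schema: dict[str, type]  # parameter name -> expected Python type
    rpc_call: Callable[..., Any]   # the underlying (actual) RPC method

    def describe(self) -> str:
        """Render a one-line summary that can be placed in an LLM prompt."""
        params = ", ".join(
            f"{key}: {expected.__name__}"
            for key, expected in self.input_schema.items()
        )
        return f"{self.name}({params}): {self.description}"

    def invoke(self, arguments: dict[str, Any]) -> Any:
        """Validate LLM-produced arguments, then map them onto the RPC call."""
        unknown = set(arguments) - set(self.input_schema)
        if unknown:
            raise ValueError(f"unknown arguments: {sorted(unknown)}")
        for key, expected in self.input_schema.items():
            if key not in arguments:
                raise ValueError(f"missing argument: {key}")
            if not isinstance(arguments[key], expected):
                raise TypeError(f"{key} must be of type {expected.__name__}")
        return self.rpc_call(**arguments)


def fake_job_search_rpc(keywords: str, limit: int) -> list[str]:
    """Stand-in for a real RPC client; returns canned results."""
    return [f"{keywords} job #{i}" for i in range(1, limit + 1)]


search_jobs = Skill(
    name="search_jobs",
    description="Find open job postings that match a keyword query. "
                "Use when the member asks about available jobs.",
    input_schema={"keywords": str, "limit": int},
    rpc_call=fake_job_search_rpc,
)

print(search_jobs.describe())
print(search_jobs.invoke({"keywords": "data engineer", "limit": 2}))
```

Validating arguments before the call matters because the LLM, not a programmer, produces them. The description string is what lets the model decide when a given skill applies.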
LinkedIn engineers wrote in a statement: "Skills like this enable the LLM to perform a variety of actions related to our products, such as viewing profiles, searching for articles/people/jobs/companies, and even querying internal analytics systems. The same technology is also used to call non-LinkedIn APIs such as Bing search and news." This approach not only extends what the LLM can do but also improves its integration with existing technology infrastructure, allowing LLMs to be used more widely across the enterprise.