[FRIAM] Language Model Understanding

Steve Smith sasmyth at swcp.com
Sat Oct 7 20:36:30 EDT 2023


Tom -

This is probably a little less relevant to the business of journalism 
than you deserve, but it is a pretty good (IMO) fleshing-out of some of 
the technical details of (L)LMs and might fill in some blanks:

    https://developers.google.com/machine-learning/resources/intro-llms

There must be (by now) some resources aimed more directly at 
journalists, ones that not only de-jargonize things a little but also 
make them more relevant to your domain.

Here it is, straight from the /Horse's Mouth/ (GPT 3.5) itself (I leave 
it to you to tell us how well these answers work for you). It is 
probably worth engaging with ChatGPT yourself and asking progressively 
more focused questions...   a good self-study in "prompt engineering"!

    http://chat.openai.com
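
If you'd rather do that self-study from code than from the web page, here is a 
minimal sketch of the same loop of asking progressively more focused questions 
through OpenAI's Python API (the pre-1.0 `openai` package). The model name, the 
follow-up questions, and the environment variable holding the key are my 
assumptions; adjust them to whatever you actually use.

    # Sketch of "prompt engineering" as an iterative conversation: ask a broad
    # question, then follow up with progressively more focused ones.
    # Assumes the pre-1.0 `openai` package and an API key in OPENAI_API_KEY;
    # the model name and the questions below are placeholders.
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    messages = [{"role": "user",
                 "content": "Explain large language models for a veteran journalist."}]
    follow_ups = ["What exactly is a 'token'?",
                  "How might a small newsroom use fine-tuning?"]

    for _ in range(len(follow_ups) + 1):
        reply = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",    # assumed model name
            messages=messages,
        )["choices"][0]["message"]["content"]
        print(reply, "\n" + "-" * 60)
        # Keep the earlier answers in the running message list, then ask the
        # next, more focused question.
        messages.append({"role": "assistant", "content": reply})
        if follow_ups:
            messages.append({"role": "user", "content": follow_ups.pop(0)})

Each pass keeps the earlier answers in the messages list, which is all that 
"context" really amounts to at the API level.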

***

    Introduction to Modern Language Models for Veteran Journalists

    In today's digital age, the field of journalism is undergoing a
    significant transformation, largely driven by advancements in
    technology. One key innovation that has had a profound impact on the
    industry is the development of modern language models, often
    abbreviated as LLMs. While you may not be a tech expert,
    understanding the basics of LLMs can be invaluable for journalists.
    So, let's break it down in simple terms.

    **What Are Language Models?**

    At its core, a language model is like a super-smart digital
    writer. It's a computer program trained to understand and generate
    human language. Think of it as a tool that helps computers
    understand and use words, sentences, and paragraphs just like humans
    do.

    **The Power of Large Language Models (LLMs)**

    Now, let's talk about Large Language Models (LLMs). These are the
    big players in the world of language models. They're like language
    supercomputers because they're trained on massive amounts of text
    from the internet. This means they've learned a lot about how people
    write and talk.

    **How LLMs Help Journalists**

    1. **Content Generation**: LLMs can generate written content, from
    news articles to summaries and even creative pieces. Journalists can
    use them to draft stories faster and get ideas flowing.

    2. **Fact-Checking**: They're handy for fact-checking, helping you
    quickly verify information by scanning a vast database of knowledge.

    3. **Summarization**: LLMs can take long, complex documents and
    create concise summaries, saving you time on research.

    4. **Translation**: They can also translate text between
    languages, which can be incredibly useful for international reporting.

    5. **Personalization**: Some news websites use LLMs to tailor
    content to individual readers, providing a more personalized news
    experience.

    6. **Social Media Analysis**: LLMs can sift through social media
    data, helping journalists understand public sentiment and trends.

    **Potential Concerns**

    While LLMs are incredibly powerful, there are some concerns to
    keep in mind. They might unintentionally spread misinformation if
    not used carefully, and they can't replace human judgment or
    critical thinking.

    **Conclusion**

    In a nutshell, modern language models (LLMs) are like digital
    writing assistants that can help journalists write better, research
    faster, and adapt to the changing landscape of digital journalism.
    While they're powerful tools, it's important to use them
    responsibly, keeping in mind their strengths and limitations, as
    well as the ethical considerations of automated content creation in
    journalism.

Can you elaborate on some of the technical jargon?

    Certainly! Here are some common technical jargon related to Large
    Language Models (LLMs) explained in journalist-friendly terms:

    1. **Pretraining**: This is the initial phase where the LLM learns
    from a massive amount of text data on the internet. Think of it as
    the model absorbing a vast library of information to understand
    language better.

    2. **Fine-tuning**: After pretraining, LLMs undergo fine-tuning to
    specialize them for specific tasks. It's like giving the model extra
    training in a particular subject, like finance or healthcare, to
    make it more useful for specialized reporting.

    3. **Token**: In LLMs, text is broken down into smaller units
    called tokens. A token can be as short as one character or as long
    as one word. Knowing the number of tokens is important because it
    affects how much text the model can handle at once.

    4. **Inference**: This term refers to the process of using a
    trained LLM to generate text or make predictions. Journalists use
    inference to create content, answer questions, or analyze data.

    5. **Prompt**: A prompt is the input you provide to an LLM to get
    a response. It's like giving the model a question or a starting
    point for it to generate text. Journalists often craft prompts to
    get specific information or insights.

    6. **Bias**: LLMs can sometimes produce biased or unfair content
    because they've learned from biased internet data. Journalists
    should be aware of this and critically evaluate the model's outputs
    for any biases.

    7. **Algorithm**: This is a fancy word for a set of rules or
    instructions that an LLM follows to perform tasks. Algorithms guide
    how the model generates text and makes decisions.

    8. **NLP (Natural Language Processing)**: NLP is the field of
    study that focuses on teaching computers to understand and generate
    human language. LLMs are a product of NLP research.

    9. **ML (Machine Learning)**: LLMs are a type of machine learning
    model. Machine learning is a way to make computers learn from data
    and improve their performance over time.

    10. **Training Data**: This is the text data used to teach an LLM.
    It's like the model's textbooks, and the quality and diversity of
    this data influence how well the LLM performs.

    11. **Bias Mitigation**: This is the process of reducing bias in
    LLMs. It's important in journalism to ensure that the information
    provided by the model is fair and unbiased.

    12. **Ethical AI**: This refers to the practice of using LLMs
    responsibly and ethically. Journalists should consider the ethical
    implications of using these tools, especially when it comes to
    issues like privacy and misinformation.

    13. **API (Application Programming Interface)**: Some LLMs offer
    APIs that allow journalists to interact with the model through code,
    enabling custom applications and integrations.

    14. **Fine-Tuned Models**: These are LLMs that have been trained
    for specific tasks, like summarization, translation, or
    question-answering. They are more specialized and can be
    particularly useful for journalism.

    Understanding these terms can help journalists effectively
    leverage LLMs in their work while also being mindful of their
    limitations and ethical considerations.
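
To make item 3 above ("Token") a bit more concrete, here is a minimal sketch of 
counting tokens with OpenAI's tiktoken library; the encoding name is my 
assumption about what GPT-3.5-class models use.

    # Tokenization sketch using the tiktoken library (pip install tiktoken).
    # "cl100k_base" is assumed to be the encoding for GPT-3.5-class models.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    sentence = "Large language models break text into tokens."
    token_ids = enc.encode(sentence)

    print(len(token_ids), "tokens")               # counts against the context window
    print([enc.decode([t]) for t in token_ids])   # the individual token strings

The token count, not the word count, is what determines how much of a document 
fits in the model's context window at once.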


- Steve

On 10/7/23 5:13 PM, Tom Johnson wrote:

> Thanks. Those are impressive numbers.
> Tom
>
> On Sat, Oct 7, 2023 at 4:11 PM Marcus Daniels <marcus at snoutfarm.com> 
> wrote:
>
>     The “large” refers to the number of parameters used.  A smaller
>     large language model – a deep neural net – starts at about 3 billion
>     parameters, but larger ones like Claude 2 (the latest large
>     language model of the company that wrote the paper Steve
>     mentioned) have more than 130 billion parameters.   Amazingly, it
>     is possible, using (rooms of) GPUs and other accelerators, to
>     optimize in a space of this size.   The billions of
>     parameters come from the vocabulary size – the number of tokens
>     that need to be discriminated, the many layers of transformers
>     that are needed to capture the complexity of human and non-human
>     languages (like DNA), and the context window size – how many
>     paragraphs or pages the model is trained on at a time.   A small
>     language model might be suitable for understanding the geometries
>     of chemicals, say.
>
>     *From:* Friam <friam-bounces at redfish.com> *On Behalf Of *Tom Johnson
>     *Sent:* Saturday, October 7, 2023 2:38 PM
>     *To:* The Friday Morning Applied Complexity Coffee Group
>     <friam at redfish.com>
>     *Subject:* Re: [FRIAM] Language Model Understanding
>
>     Thanks for passing this along, Steve. I wish, however, the authors
>     of this short piece would have included a definition of, in their
>     usage, "Large Language Models" and "Small Language Models." 
>     Perhaps I can find those in the larger paper.
>
>     Tom
>
>     On Sat, Oct 7, 2023 at 12:34 PM Steve Smith <sasmyth at swcp.com> wrote:
>
>         This popular-press article came through my Google News feed
>         recently which I thought might be useful to the
>         Journalists/English-Majors on the list to help understand how
>         LLMs work, etc.   When I read it in detail (forwarded from my
>         TS (TinyScreenPhone) on my LS (Large Screen Laptop)) I found
>         it a bit more detailed and technical than I'd expected, but
>         nevertheless rewarding and possibly offering some traction to
>         Journalism/English majors as well as those with a larger
>         investment in the CS/Math implied.
>
>             Decomposing Language Models into Understandable Components
>             <https://www.anthropic.com/index/decomposing-language-models-into-understandable-components>
>
>         and the (more) technical paper behind the article
>
>             https://transformer-circuits.pub/2023/monosemantic-features/index.html
>             <https://transformer-circuits.pub/2023/monosemantic-features/index.html>
>
>         Despite having sent a few dogs into vaguely similar scuffles
>         in my careen(r):
>
>             Faceted Ontologies for Pre Incident Indicator Analysis
>             <https://apps.dtic.mil/sti/tr/pdf/ADA588086.pdf>
>             SpindleViz
>             <https://www.ehu.eus/ccwintco/uploads/c/c6/HAIS2010_925.pdf>
>             ...
>
>         ... I admit to finding this both intriguing and well over my
>         head on casual inspection...  the (metaphorical?) keywords
>         that drew me in  most strongly included /Superposition/ and
>         /Thought Vectors/, though they are (nod to Glen) probably
>         riddled (heaped, overflowing, bursting, bloated ... ) with
>         excess meaning.
>
>         https://gabgoh.github.io/ThoughtVectors/
>
>         This leads me (surprise!) to an open ended discursive series
>         of thoughts probably better left for a separate posting
>         (probably rendered in a semasiographic language like Heptapod
>         B
>         <https://en.wikipedia.org/wiki/Heptapod_languages#Orthography>).
>
>         <must... stop... now... >
>
>         - Steve
>
>         -. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
>         FRIAM Applied Complexity Group listserv
>         Fridays 9a-12p Friday St. Johns Cafe   /  Thursdays 9a-12p
>         Zoom https://bit.ly/virtualfriam
>         to (un)subscribe
>         http://redfish.com/mailman/listinfo/friam_redfish.com
>         FRIAM-COMIC http://friam-comic.blogspot.com/
>         archives:  5/2017 thru present
>         https://redfish.com/pipermail/friam_redfish.com/
>           1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
>
>
>