<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Tom -</p>
<p>This is probably a little less relevant to the business of
journalism than you deserve, but it's a pretty good (IMO)
fleshing out of some of the technical details of (L)LMs, which
might fill in some blanks:</p>
<blockquote>
<p><a moz-do-not-send="true"
href="https://developers.google.com/machine-learning/resources/intro-llms"
class="moz-txt-link-freetext">https://developers.google.com/machine-learning/resources/intro-llms</a></p>
</blockquote>
<p>There must be (by now) some more targeted resources for
journalists, not only to de-jargonize a little but also to make
the material more relevant to your domain. <br>
</p>
<p>Here, straight from the <i>Horse's Mouth</i> (GPT 3.5) itself
(I leave it to you to tell us how well these answers work for
you). It is probably worth engaging with ChatGPT yourself and
asking more and more focused questions... a good self-study in
"prompt engineering"!</p>
<blockquote>
<p><a moz-do-not-send="true" href="http://chat.openai.com"
class="moz-txt-link-freetext">http://chat.openai.com</a></p>
</blockquote>
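<p>If you want to go past the web interface, the same model is reachable
over a plain HTTP API. Here is a minimal Python sketch of what such a
request looks like; the endpoint and model name are the GPT-3.5-era ones,
and the system-role wording, temperature, and API key are my assumptions to
check against OpenAI's current docs:</p>

```python
import json

# Endpoint for OpenAI's chat completions API (GPT-3.5 era).
API_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(question, system_role="You are a helpful explainer for journalists."):
    """Assemble the JSON payload for a single-question chat completion."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": system_role},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,  # lower = more focused, less "creative"
    }

payload = build_chat_request("Explain LLM tokenization for a newsroom audience.")
print(json.dumps(payload, indent=2))

# To actually send it (requires an API key):
#   import requests
#   r = requests.post(API_URL, json=payload,
#                     headers={"Authorization": "Bearer YOUR_KEY"})
#   print(r.json()["choices"][0]["message"]["content"])
```

<p>The "messages" list is the whole conversation so far, which is why
follow-up questions in the web UI feel stateful even though each API call
is independent.</p>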
<p>***<br>
</p>
<blockquote><i>Introduction to Modern Language Models for Veteran
Journalists</i><i><br>
</i><i><br>
</i><i>In today's digital age, the field of journalism is
undergoing a significant transformation, largely driven by
advancements in technology. One key innovation that has had a
profound impact on the industry is the development of modern
language models, often abbreviated as LLMs. While you may not be
a tech expert, understanding the basics of LLMs can be
invaluable for journalists. So, let's break it down in simple
terms.</i><i><br>
</i><i><br>
</i><i>**What Are Language Models?**</i><i><br>
</i><i><br>
</i><i>At its core, a language model is like a super-smart digital
writer. It's a computer program trained to understand and
generate human language. Think of it as a tool that helps
computers understand and use words, sentences, and paragraphs
just like humans do.</i><i><br>
</i><i><br>
</i><i>**The Power of Large Language Models (LLMs)**</i><i><br>
</i><i><br>
</i><i>Now, let's talk about Large Language Models (LLMs). These
are the big players in the world of language models. They're
like language supercomputers because they're trained on massive
amounts of text from the internet. This means they've learned a
lot about how people write and talk.</i><i><br>
</i><i><br>
</i><i>**How LLMs Help Journalists**</i><i><br>
</i><i><br>
</i><i>1. **Content Generation**: LLMs can generate written
content, from news articles to summaries and even creative
pieces. Journalists can use them to draft stories faster and get
ideas flowing.</i><i><br>
</i><i><br>
</i><i>2. **Fact-Checking**: They're handy for fact-checking,
helping you quickly verify information by scanning a vast
database of knowledge.</i><i><br>
</i><i><br>
</i><i>3. **Summarization**: LLMs can take long, complex documents
and create concise summaries, saving you time on research.</i><i><br>
</i><i><br>
</i><i>4. **Translation**: They can also translate text between
languages, which can be incredibly useful for international
reporting.</i><i><br>
</i><i><br>
</i><i>5. **Personalization**: Some news websites use LLMs to
tailor content to individual readers, providing a more
personalized news experience.</i><i><br>
</i><i><br>
</i><i>6. **Social Media Analysis**: LLMs can sift through social
media data, helping journalists understand public sentiment and
trends.</i><i><br>
</i><i><br>
</i><i>**Potential Concerns**</i><i><br>
</i><i><br>
</i><i>While LLMs are incredibly powerful, there are some concerns
to keep in mind. They might unintentionally spread
misinformation if not used carefully, and they can't replace
human judgment or critical thinking.</i><i><br>
</i><i><br>
</i><i>**Conclusion**</i><i><br>
</i><i><br>
</i><i>In a nutshell, modern language models (LLMs) are like
digital writing assistants that can help journalists write
better, research faster, and adapt to the changing landscape of
digital journalism. While they're powerful tools, it's important
to use them responsibly, keeping in mind their strengths and
limitations, as well as the ethical considerations of automated
content creation in journalism.</i><i><br>
</i></blockquote>
<p>Can you elaborate on some of the technical jargon?<br>
</p>
<blockquote>
<p><i>Certainly! Here are some common technical jargon related to
Large Language Models (LLMs) explained in journalist-friendly
terms:</i><i><br>
</i><i><br>
</i><i>1. **Pretraining**: This is the initial phase where the
LLM learns from a massive amount of text data on the internet.
Think of it as the model absorbing a vast library of
information to understand language better.</i><i><br>
</i><i><br>
</i><i>2. **Fine-tuning**: After pretraining, LLMs undergo
fine-tuning to specialize them for specific tasks. It's like
giving the model extra training in a particular subject, like
finance or healthcare, to make it more useful for specialized
reporting.</i><i><br>
</i><i><br>
</i><i>3. **Token**: In LLMs, text is broken down into smaller
units called tokens. A token can be as short as one character
or as long as one word. Knowing the number of tokens is
important because it affects how much text the model can
handle at once.</i><i><br>
</i><i><br>
</i><i>4. **Inference**: This term refers to the process of
using a trained LLM to generate text or make predictions.
Journalists use inference to create content, answer questions,
or analyze data.</i><i><br>
</i><i><br>
</i><i>5. **Prompt**: A prompt is the input you provide to an
LLM to get a response. It's like giving the model a question
or a starting point for it to generate text. Journalists often
craft prompts to get specific information or insights.</i><i><br>
</i><i><br>
</i><i>6. **Bias**: LLMs can sometimes produce biased or unfair
content because they've learned from biased internet data.
Journalists should be aware of this and critically evaluate
the model's outputs for any biases.</i><i><br>
</i><i><br>
</i><i>7. **Algorithm**: This is a fancy word for a set of rules
or instructions that an LLM follows to perform tasks.
Algorithms guide how the model generates text and makes
decisions.</i><i><br>
</i><i><br>
</i><i>8. **NLP (Natural Language Processing)**: NLP is the
field of study that focuses on teaching computers to
understand and generate human language. LLMs are a product of
NLP research.</i><i><br>
</i><i><br>
</i><i>9. **ML (Machine Learning)**: LLMs are a type of machine
learning model. Machine learning is a way to make computers
learn from data and improve their performance over time.</i><i><br>
</i><i><br>
</i><i>10. **Training Data**: This is the text data used to
teach an LLM. It's like the model's textbooks, and the quality
and diversity of this data influence how well the LLM
performs.</i><i><br>
</i><i><br>
</i><i>11. **Bias Mitigation**: This is the process of reducing
bias in LLMs. It's important in journalism to ensure that the
information provided by the model is fair and unbiased.</i><i><br>
</i><i><br>
</i><i>12. **Ethical AI**: This refers to the practice of using
LLMs responsibly and ethically. Journalists should consider
the ethical implications of using these tools, especially when
it comes to issues like privacy and misinformation.</i><i><br>
</i><i><br>
</i><i>13. **API (Application Programming Interface)**: Some
LLMs offer APIs that allow journalists to interact with the
model through code, enabling custom applications and
integrations.</i><i><br>
</i><i><br>
</i><i>14. **Fine-Tuned Models**: These are LLMs that have been
trained for specific tasks, like summarization, translation,
or question-answering. They are more specialized and can be
particularly useful for journalism.</i><i><br>
</i><i><br>
</i><i>Understanding these terms can help journalists
effectively leverage LLMs in their work while also being
mindful of their limitations and ethical considerations.</i><br>
</p>
</blockquote>
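<p>To make the "token" entry above concrete, here's a toy Python sketch.
Real tokenizers (BPE and friends) are learned from data, not fixed-width
like this; the 4-character rule is purely illustrative of why token counts
run higher than word counts:</p>

```python
import re

def toy_tokenize(text):
    """Split text into word and punctuation chunks, then break long words
    into 4-character pieces -- a crude stand-in for subword tokenization."""
    pieces = []
    for chunk in re.findall(r"\w+|[^\w\s]", text):
        while len(chunk) > 4:
            pieces.append(chunk[:4])
            chunk = chunk[4:]
        pieces.append(chunk)
    return pieces

sentence = "Journalists verify extraordinary claims."
tokens = toy_tokenize(sentence)
print(tokens)
print(f"{len(sentence.split())} words -> {len(tokens)} tokens")
```

<p>Even this crude rule turns a few words into several times as many
tokens, which is why context-window limits are quoted in tokens rather than
words or pages.</p>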
<p><br>
</p>
<p>- Steve<br>
</p>
<p>On 10/7/23 5:13 PM, Tom Johnson wrote:</p>
<blockquote type="cite"
cite="mid:CABg8s8JPLdcRwtfpyOumN_0FvXPYRWmh_668-=f_aqzKyAzU=w@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">Thanks. Those are impressive numbers.
<div>Tom</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sat, Oct 7, 2023 at 4:11 PM
Marcus Daniels <<a href="mailto:marcus@snoutfarm.com"
moz-do-not-send="true" class="moz-txt-link-freetext">marcus@snoutfarm.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="msg7529318073659864475">
<div style="overflow-wrap: break-word;" lang="EN-US">
<div class="m_7529318073659864475WordSection1">
<p class="MsoNormal">The “large” refers to the number of
parameters used. A smaller large language model – a deep
neural net – starts at about 3 billion parameters, but larger
ones like Claude 2 (the latest large language model from the
company that wrote the paper Steve mentioned) have more than
130 billion parameters. Amazingly, it is possible, using
(rooms of) GPUs and other accelerators, to optimize in a space
of this size. The billions of parameters come from the
vocabulary size – the number of tokens that need to be
discriminated – the many layers of transformers needed to
capture the complexity of human and non-human languages (like
DNA), and the context window size – how many paragraphs or
pages the model is trained on at a time. A small language
model might be suitable for understanding the
geometries of chemicals, say. </p>
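<p class="MsoNormal">A back-of-envelope version of that arithmetic, in
Python; the vocabulary size, width, and depth below are illustrative
guesses, not any published architecture:</p>

```python
# Back-of-envelope parameter count for a decoder-only transformer.
# The per-layer factor of ~12*d^2 covers attention (4*d^2) plus the
# MLP block (8*d^2); it ignores layer norms, biases, and the output head.
def transformer_params(vocab_size, d_model, n_layers):
    embedding = vocab_size * d_model      # token embedding table
    per_layer = 12 * d_model ** 2         # attention + MLP, roughly
    return embedding + n_layers * per_layer

# Illustrative "smaller" model: 50k-token vocabulary, width 2560, 32 layers
n = transformer_params(50_000, 2560, 32)
print(f"~{n / 1e9:.1f}B parameters")  # → ~2.6B parameters
```

<p class="MsoNormal">Widen and deepen the network and the quadratic
per-layer term quickly pushes the total into the tens or hundreds of
billions.</p>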
<p class="MsoNormal"> </p>
<div
style="border-right:none;border-bottom:none;border-left:none;border-top:1pt
solid rgb(225,225,225);padding:3pt 0in 0in">
<p class="MsoNormal"><b>From:</b> Friam <<a
href="mailto:friam-bounces@redfish.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">friam-bounces@redfish.com</a>>
<b>On Behalf Of
</b>Tom Johnson<br>
<b>Sent:</b> Saturday, October 7, 2023 2:38 PM<br>
<b>To:</b> The Friday Morning Applied Complexity
Coffee Group <<a href="mailto:friam@redfish.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">friam@redfish.com</a>><br>
<b>Subject:</b> Re: [FRIAM] Language Model
Understanding</p>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal">Thanks for passing this along,
Steve. I wish, however, the authors of this short
piece would have included a definition of, in their
usage, "Large Language Models" and "Small Language
Models." Perhaps I can find those in the larger
paper.</p>
<div>
<p class="MsoNormal">Tom</p>
</div>
</div>
<p class="MsoNormal"> </p>
<div>
<div>
<p class="MsoNormal">On Sat, Oct 7, 2023 at 12:34 PM
Steve Smith <<a href="mailto:sasmyth@swcp.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">sasmyth@swcp.com</a>>
wrote:</p>
</div>
<blockquote
style="border-top:none;border-right:none;border-bottom:none;border-left:1pt
solid rgb(204,204,204);padding:0in 0in 0in
6pt;margin-left:4.8pt;margin-right:0in">
<div>
<p>This popular-press article came through my
Google News feed recently which I thought might
be useful to the Journalists/English-Majors on
the list to help understand how LLMs work,
etc. When I read it in detail (forwarded from
my TS (TinyScreenPhone) on my LS (Large Screen
Laptop)) I found it a bit more detailed and
technical than I'd expected, but nevertheless
rewarding and possibly offering some traction to
Journalism/English majors as well as those with
a larger investment in the CS/Math implied.</p>
<blockquote
style="margin-top:5pt;margin-bottom:5pt">
<p><a
href="https://www.anthropic.com/index/decomposing-language-models-into-understandable-components"
target="_blank" moz-do-not-send="true">Decomposing
Language Models into Understandable
Components<br>
</a></p>
<blockquote
style="margin-top:5pt;margin-bottom:5pt">
<blockquote
style="margin-top:5pt;margin-bottom:5pt">
<blockquote
style="margin-top:5pt;margin-bottom:5pt">
<p class="MsoNormal"><img style="width:
2.4821in; height: 1.3869in;"
id="m_7529318073659864475_x0000_i1025"
src="https://efficient-manatee.transforms.svdcdn.com/production/images/Untitled-Artwork-11.png?w=2880&h=1620&auto=compress%2Cformat&fit=crop&dm=1696477668&s=d32264d5f5e32c79026b8e310e415c74"
moz-do-not-send="true" width="238"
height="133" border="0"></p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<p>and the (more) technical paper behind the
article</p>
<blockquote
style="margin-top:5pt;margin-bottom:5pt">
<p><a
href="https://transformer-circuits.pub/2023/monosemantic-features/index.html"
target="_blank" moz-do-not-send="true">https://transformer-circuits.pub/2023/monosemantic-features/index.html<br>
</a></p>
</blockquote>
<p class="MsoNormal">Despite having sent a few
dogs into vaguely similar scuffles in my
careen(r):</p>
<blockquote
style="margin-top:5pt;margin-bottom:5pt">
<p class="MsoNormal"><a
href="https://apps.dtic.mil/sti/tr/pdf/ADA588086.pdf"
target="_blank" moz-do-not-send="true">Faceted
Ontologies for Pre Incident Indicator
Analysis
</a><br>
<a
href="https://www.ehu.eus/ccwintco/uploads/c/c6/HAIS2010_925.pdf"
target="_blank" moz-do-not-send="true">SpindleViz</a><br>
...</p>
</blockquote>
<p>... I admit to finding this both intriguing and
well over my head on casual inspection... the
(metaphorical?) keywords that drew me in most
strongly included
<i>Superposition</i> and <i>Thought Vectors</i>,
though they are (nod to Glen) probably riddled
(heaped, overflowing, bursting, bloated ... )
with excess meaning.</p>
<p><a
href="https://gabgoh.github.io/ThoughtVectors/"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://gabgoh.github.io/ThoughtVectors/</a></p>
<p>This leads me (surprise!) to an open ended
discursive series of thoughts probably better
left for a separate posting (probably rendered
in a semasiographic language like
<a
href="https://en.wikipedia.org/wiki/Heptapod_languages#Orthography"
target="_blank" moz-do-not-send="true">
Heptapod B</a>). </p>
<p><must... stop... now... ></p>
<p>- Steve</p>
</div>
<p class="MsoNormal">-. --- - / ...- .- .-.. .. -..
/ -- --- .-. ... . / -.-. --- -.. .<br>
FRIAM Applied Complexity Group listserv<br>
Fridays 9a-12p Friday St. Johns Cafe /
Thursdays 9a-12p Zoom <a
href="https://bit.ly/virtualfriam"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">
https://bit.ly/virtualfriam</a><br>
to (un)subscribe <a
href="http://redfish.com/mailman/listinfo/friam_redfish.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">
http://redfish.com/mailman/listinfo/friam_redfish.com</a><br>
FRIAM-COMIC <a
href="http://friam-comic.blogspot.com/"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">http://friam-comic.blogspot.com/</a><br>
archives: 5/2017 thru present <a
href="https://redfish.com/pipermail/friam_redfish.com/"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">
https://redfish.com/pipermail/friam_redfish.com/</a><br>
1/2003 thru 6/2021 <a
href="http://friam.383.s1.nabble.com/"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">http://friam.383.s1.nabble.com/</a></p>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</blockquote>
</body>
</html>