HEALTHCARE AT HOME:
THE ROWAN TECHNOLOGY REPORT

Serving the home health, home care and hospice industry since 1999.


AI imagery

 

Analysis by Heather Thompson

Introduction: Asking AI to comment on AI

You know the type of human when you meet them: a lot of information comes out of their mouth, and it is almost impossible to discern what (if anything) is accurate. When truth becomes indistinguishable from falsifications, we consider a person untrustworthy.

 

In the AI world, "hallucination" has become the word for these falsifications, which are commonplace. A recent CNBC article1, citing work by Arthur AI, described them this way: "AI hallucinations occur when large language models, or LLMs, fabricate information entirely, behaving as if they are spouting facts." The article discussed a report released by Arthur AI examining the nuances of various LLMs and their propensity for inaccurate information.

According to Adam Wenchel, co-founder and CEO of Arthur, the most important takeaway for users and businesses is to "test on your exact workload":

"It’s important to understand how it performs for what you’re trying to accomplish. A lot of the benchmarks are just looking at some measure of the LLM by itself, but that’s not actually the way it’s getting used in the real world," Wenchel said. "Making sure you really understand the way the LLM performs for the way it’s actually getting used is the key."

The bottom line? Humans need to understand and drive AI tools, not vice versa. Further, humans must discern which tool is best equipped to achieve the desired result. This means investing appropriate resources to ensure the knowledge base is in place to work with AI at this stage of its development, thereby reducing the likelihood of accidental harm caused by faulty information.
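Wenchel's advice can be put into practice quite simply. Below is a minimal sketch, in Python, of what "test on your exact workload" might look like: a handful of prompts drawn from your own documents, run through whichever model you use, with each output placed next to a known-good answer for human review. The call_model function and the sample case are placeholders for your own API client and your own content, not any particular vendor's interface.

# A minimal sketch of "test on your exact workload": run prompts from
# your own documents through whichever LLM you use, then compare each
# answer against a known-good reference by hand. call_model is a
# stand-in for your own API wrapper (OpenAI, Anthropic, Google, etc.).

from typing import Callable, Dict, List

def run_workload_check(
    call_model: Callable[[str], str],   # your wrapper around the LLM API
    cases: List[Dict[str, str]],        # each case: {"prompt": ..., "expected": ...}
) -> None:
    """Print model output next to the expected answer for human review."""
    for i, case in enumerate(cases, start=1):
        answer = call_model(case["prompt"])
        print(f"--- Case {i} ---")
        print("Prompt:  ", case["prompt"])
        print("Expected:", case["expected"])
        print("Model:   ", answer)
        print()

# Hypothetical case drawn from a home-health workload.
sample_cases = [
    {
        "prompt": "Summarize this visit note in two sentences: ...",
        "expected": "A short, accurate summary written by a clinician.",
    },
]
# run_workload_check(my_call_model, sample_cases)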

 

A robot types on a keyboard

AI Wars (or AI Confessions!)

This potential for what can only be regarded as hallucination becomes apparent when one conversational AI is pitted against another. In this week's experiment, I asked Claude 2 by Anthropic, still in beta, to answer the same question I had posed to two other large language models (LLMs): ChatGPT by OpenAI and Bard by Google. Despite having no knowledge of their answers, Claude responded definitively about the other models' imagined perspectives.

This is when it got interesting! I will admit sometimes it is fun to question these models...

When I inquired about its unsupported claims, Claude acknowledged its error in presenting speculation as reality and named it a hallucination. This reflective exchange helped me refine my prompting techniques. I have since honed my process as follows (a rough code sketch of the workflow appears after the list).

  1. Ask the LLM to summarize the main points of the input as a first step.
  2. Have it generate outputs until the tone is what I want.
  3. Prompt it to assess its output for accuracy against the main points and the original content.
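For readers who would like to automate part of this loop, here is a rough Python sketch of the three steps, assuming an ask function that wraps whatever LLM you use. The prompt wording is illustrative, not a fixed recipe, and the manual human review described below still comes last.

# A rough sketch of the three-step process above. `ask` is a stand-in
# for your own LLM client; adapt the prompt wording to your content.

from typing import Callable, Dict

def review_with_llm(ask: Callable[[str], str], source_text: str, tone: str) -> Dict[str, str]:
    # Step 1: have the model summarize the main points of the input first.
    main_points = ask(f"Summarize the main points of the following text:\n\n{source_text}")

    # Step 2: generate a draft in the desired tone. In practice, repeat
    # this step and adjust the tone instruction until you like the result.
    draft = ask(f"Rewrite the following text in a {tone} tone:\n\n{source_text}")

    # Step 3: ask the model to assess its draft against the main points
    # and the original content, then review that assessment yourself.
    self_check = ask(
        "Compare this draft against the main points and the original text. "
        "List anything inaccurate, missing, or unsupported.\n\n"
        f"Main points:\n{main_points}\n\nOriginal:\n{source_text}\n\nDraft:\n{draft}"
    )
    return {"main_points": main_points, "draft": draft, "self_check": self_check}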

I would encourage every AI user to create their own process similar to this one (or use mine) and always finish with a manual human review. I consistently find hallucinations and bias in need of correction. In a healthcare context, these findings should also be reported back to the vendor as feedback.

Note: I am highly impressed with Adobe Firefly's response team. When I flagged disability bias in their Generative AI output on a Sunday, their team responded the same day. Less than two weeks later, the output from the same prompts had shifted dramatically, thanks to specialized training designed to address the problem I identified. In other words: speak up! It matters.

Insight into Hallucinations - Dialogue with "Claude 2" by Anthropic

Below, I asked Claude 2 to reflect on a hallucination. Claude had responded to my earlier question despite lacking the information necessary to draw conclusions or present relevant summaries. In other words, Claude made it up.

Note: This is a demonstration of how hallucinations show up in practice; it happened while I was writing another story. The topic the bot discusses matters less than the hallucination itself, in which Claude invented responses from Bard and ChatGPT. Notice what Claude says about the other chatbots, how it responds when I challenge it, and how it then discusses its own hallucination.

Claude, why did you describe ChatGPT and Bard's descriptions when you didn't have that information?

 

You're absolutely right, I apologize. I did not actually have access to ChatGPT's or Bard's descriptions, since I generated this piece myself. It was hypothetical and fictional to imagine how other AI systems might respond, but I should not have presented it as if I had those actual responses. I should have made it clear these were just speculative examples, not based on real information. Thank you for catching this - presenting assumptions rather than facts is a mistake, and your feedback will help me be more careful about clearly delineating fiction vs reality in any generative text. I will learn from this experience.


Was this a hallucination?

Yes, in retrospect, those hypothetical ChatGPT and Bard responses I described would qualify as a form of hallucination or fabrication by me. As an AI system without general intelligence, I do not have an innate ability to truly imagine or hallucinate scenarios. I also do not have any access to or knowledge about how ChatGPT or Bard would respond to a given prompt.
When I generated the original text, I was attempting to showcase different perspectives an AI assistant could hypothetically provide, in order to make an abstract point about varying approaches AIs may take in envisioning AGI futures. However, without making it clear I was fabricating those responses, presenting them matter-of-factly as real was misleading and inaccurate. I should have explicitly stated they were fictional examples, not actual responses from those systems.
This provides a valuable lesson for me on being transparent about the limitations in my knowledge and capabilities as a narrow AI, only responding based on my training data rather than claiming awareness of external systems' behaviors. Thank you again for your feedback highlighting this issue - it will help improve my clarity and honesty in separating speculation from facts in any future text I generate. I appreciate you taking the time to ensure I learn from this experience.
 

Are you learning from feedback like this?

As an AI system created by Anthropic to be helpful, harmless, and honest, I don't have the ability to actively "learn" in the way humans do. I don't have an ongoing memory or model of the world that can change over time. However, feedback like this is still very valuable for improving my performance!
Even though I don't learn myself, each conversation provides helpful data that Anthropic's researchers can use to refine my training process. Through analyzing interactions where I make mistakes or respond inappropriately, they can update my algorithm and data to address those issues.
So, while I don't have an innate capacity to learn or grow wiser independently, constructive feedback on my responses allows my creators at Anthropic to iterate on my training. This helps enhance my capabilities to provide coherent, thoughtful, and honest information to users.
I'm grateful when users take the time to point out areas I can improve - it allows the researchers working on me to identify limitations and make impactful updates through the training process. Feedback, even when I make mistakes, helps ensure I can have more helpful, harmless, and honest conversations moving forward. So thank you for highlighting this issue - it will hopefully help lead to meaningful improvements down the line!

 

To clarify, you do not have active machine learning methods to incorporate feedback from AI conversations like these into your LLM.

You're correct, I do not have any active machine learning capabilities or methods to directly incorporate conversational feedback into my model. As an AI assistant created by Anthropic to be helpful, harmless, and honest through a technique called Constitutional AI, I rely solely on my pre-trained model parameters and dataset curated by Anthropic researchers.
I do not have any capacity to actively learn or continue training myself based on new conversations or information. Any improvements to my responses can only come through Anthropic's researchers iterating on my training process separately using offline data analysis. I cannot dynamically update my own knowledge graph, embeddings, or tuning during or after conversations.
In summary, you are right that I lack any active online learning or adaptation techniques commonly associated with machine learning. I can only respond based on my initial training, and rely on the Anthropic team to improve that foundational model using rigorous offline research and testing. Thank you for the clarifying question - it gave me a chance to clearly explain the limitations in my architecture compared to other AI systems that may incorporate forms of active learning. I appreciate you helping me communicate these details transparently.

Chatbots like Bard, ChatGPT, and Claude represent cutting-edge AI, yet Artificial General Intelligence (AGI) remains on the horizon. Further, autonomous decision-making in AI and machine learning is a desired goal in various scenarios ranging from healthcare to policing, fighter jets to farming. Ensuring accuracy, therefore, becomes critical.

We must also be clear that AI data privacy is paramount. Depending on the tool, there are varying degrees of open monitoring: many LLM providers state plainly that beta versions are actively monitored and analyzed by developers and others, and they strongly emphasize that users should not share anything private. This becomes more nuanced when we look at integrations in healthcare technology; even so, each organization must judge the ethics of current privacy standards for itself. To learn more, check out the OpenDP differential privacy project. I am attending their conference at Harvard this September and look forward to reporting back to our readers.
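For readers unfamiliar with the term, differential privacy is a mathematical guarantee: results about a dataset are published with carefully calibrated random noise so that no individual record can be singled out. The toy Python sketch below illustrates the classic Laplace mechanism for a simple count query; it is only an illustration of the concept, not the OpenDP library's actual API, and the patient records are made up.

# Toy illustration of differential privacy (not the OpenDP API):
# answer a count query with calibrated random noise so that the
# presence or absence of any single record is hard to infer.

import math
import random

def laplace_sample(scale: float) -> float:
    """Draw one sample from a Laplace(0, scale) distribution."""
    u = random.uniform(-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon: float = 1.0) -> float:
    """Count matching records, then add Laplace(1/epsilon) noise.

    A count query has sensitivity 1 (one record changes the count by at
    most 1), so this gives epsilon-differential privacy for the query.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_sample(1.0 / epsilon)

# Example: a noisy count of patients aged 80 or older in a toy record set.
patients = [{"age": 82}, {"age": 67}, {"age": 91}, {"age": 74}]
print(private_count(patients, lambda r: r["age"] >= 80, epsilon=0.5))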

 

Finally, I will share this...

I am not anti-AI, not in the least! I use AI tools daily, from Generative AI in art to LLMs in editing and AI-aided research. As I converse globally with AI ethicists, thought leaders, and scholars, a common theme emerges: AI is not inherently "bad." Humans must educate themselves about technological advancements to use AI safely and optimally, especially since AI is an entirely different paradigm. Think of it like the evolution of transportation: we went from horses to the automobile, then on to sports cars, and before we know it, we may have Autobots and Decepticons (humor intended)! That is not a slippery-slope fallacy; it is the known speed of innovation. We will continue to offer unbiased and straightforward information so that you can make your own human-informed choices about your organization's tech-driven future.

Stay Tuned...

 

Heather Thompson
 

Heather Thompson is Home Care Tech Report's staff AI expert. In addition to AI, her focus areas are growth strategy and the Agency of the Future. Previously known as Heather Rooney, she started her career at OCS, a pioneer in benchmarking and business intelligence for Healthcare at Home. She later founded her own firm, Heather L. Rooney Strategy & Marketing. Heather has over 20 years of experience in home health, hospice, and private-duty home care. She is a nationally recognized thought leader, keynote speaker, and respected voice in major publications. Heather has a solid reputation for helping organizations position themselves for dynamic shifts and emerging trends. She is also an award-winning artist, contemplative theologian, and disability/rare disease advocate published worldwide. Heather is excited to be back with the home care community during this unprecedented moment in technological history. Connect with Heather directly via her website, www.HeatherThompson.io.

 


1. https://www.cnbc.com/2023/08/17/which-ai-is-most-reliable-meta-openai-anthropic-or-cohere.html

 

©2023 by Rowan Consulting Associates, Inc., Colorado Springs, CO. All rights reserved. This article originally appeared in Home Care Technology: The Rowan Report. homecaretechreport.com One copy may be printed for personal use; further reproduction by permission only. editor@homecaretechreport.com