AI for Scouting Reports, Game Predictions, and Analysis

#1
RetroVol (Well-Known Member; joined Jul 16, 2025)
I know there are at least a few forum participants who are interested in the way I've been using AI to generate Scouting Reports, extract predictions, determine the winner, analyze substitution patterns, etc. Others, I'm sure, have no interest whatsoever. I'm starting this thread as a place to describe what I'm doing and the interesting things I run into, and for others to discuss both this project and the implications of AI use in general, in particular how the availability of AI may affect how we educate kids in the future. It may well be a good place for folks to question the use of AI, rail against it, etc. Whatever. I recognize this is a bit off topic, but I will stay focused on my use of AI on Lady Vol topics. I think the best way to learn is to try, and that discussions about specific attempts to use these tools are far better than generalities. Plus, maybe we'll come up with some additional feasible uses for Lady Vol topics. Anyway, it may be a bit before I get the first substantive post up, assuming there isn't a mass demand that this thread get taken down, which, if there is, certainly won't hurt my feelings!

Regardless, I'll still generate the kind of content I've been posting and put it in the proper threads, unless, again, folks don't want it.
 
#3
Shouldn't a properly established AI be able to more accurately predict outcomes/spreads?

AI, LLMs, etc. are quite good at analyzing the past, but they struggle as much as, if not more than, humans to predict the future. They struggle with nuance and gray areas. They are also built to be "helpful," because the designers figured an argumentative but truthful model would get less interaction than a "humble, helpful" one; this often causes the model to provide answers it thinks the asker wants to hear. This is why it is not a great option for therapy, investing, tax advice, etc.

These models should be thought of as junior-level analysts capable of instantaneous but flawed work product. They are tools for a true expert to refine, and to make themselves more insightful and productive. I think Retro does a good job of this, and there are several others who enjoy critiquing AI/LLM outputs.
 
#4
(Quoting RetroVol's opening post)
"We" want it. You have made a huge upgrade to the game thread!

Thanks for the time and effort you invest in generating these pre-game analyses.
 
#6
It was interesting: literally minutes ago I received an unsolicited but not unwelcome Google notification of an AI pregame summary for LVs/Missouri. This is the first time I've received an AI version with analysis. It's fairly decent... and shareable.

 
#10
Just a brief note on the Texas Scouting report:

* I used Grok to scan X and got representative posts, which I then put into ChatGPT, asking it to keep them in its memory. Grok tried to paraphrase posts while still presenting them as quotes. Thankfully, it said so, and I re-prompted with explicit instructions against that. This is a huge no-no in journalism. Question for future investigation: if I had asked Grok to be an experienced and highly ethical sports reporter, would it have made that mistake? Roles bring into play patterns in the LLM relevant to that role. Interesting to experiment with.

* I got a list of news articles from Gemini, then asked ChatGPT to summarize and keep in memory.

* I had ChatGPT go out and look up season player statistics and stats from the last game and note deviations, then put these in memory. It didn't really use these. If I had chosen to manually tweak the Scouting Report, I'd have probably focused on Spearman for Tennessee. Both games, Texas vs. Vanderbilt and us vs. Missouri, were out of character for the season, so it's hard to know what to focus on, but Zee's disappearance in a blowout is noticeable.

* I had ChatGPT look at upcoming games, NET standings, etc. to put this game into perspective. Weirdly, it read the ESPN page wrong on Quad 1 standings, saying that Tennessee was 0-2 rather than 5-5.

* I then prompted ChatGPT with "Based on this chat, and pulling extra information as needed from the web, produce a pithy, punchy, provocative post for volnation.com in bbcode containing key themes, information, and insights for the upcoming Tennessee vs. Texas women's basketball game on Sunday."

As it always seems to do, it appeared from its "thinking" log that it went back and re-did part of the work it had done before, such as season records. But at least it didn't make the Quad 1 mistake again.

OVERALL THOUGHTS:

This process produced a better result, though not perfect. Having a knowledgeable human in the loop is still important.

The process could be refined and automated.
Tools like n8n allow users to design a workflow that breaks a task into chunks, uses the best AI for each chunk, then combines the results to produce a final product. However, that requires another monthly subscription and the use of API keys, which I'm not interested in right now. But if I were producing these for a job, I'd definitely move in this direction. If it's possible to highly automate writing a novel (no claims as to quality!), I'm sure I could turn a better version of today's process into a one-click effort, then just read the result and do final edits to produce a report that combines AI capabilities with human knowledge, judgment, and overall understanding of the world.
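For the curious, the chunk-by-chunk idea can be sketched in plain Python before committing to a tool like n8n. Everything below is hypothetical scaffolding: the function names and bbcode framing are mine, and each placeholder would be replaced by a call to whichever model handles that chunk.

```python
# Sketch of automating the multi-step scouting workflow described above.
# Each step is a placeholder; in a real pipeline each would call a
# different model's API (e.g., Grok for X posts, Gemini for articles).

def gather_social_posts(opponent):
    # Placeholder: would query X for fan and beat-writer posts.
    return [f"(verbatim post about {opponent})"]

def gather_articles(opponent):
    # Placeholder: would collect recent news coverage.
    return [f"(article summary about {opponent})"]

def gather_stats(opponent):
    # Placeholder: would pull season and last-game statistics.
    return {"opponent": opponent, "notes": "season vs. last-game deviations"}

def synthesize_report(opponent, posts, articles, stats):
    # Placeholder: would prompt an LLM to combine the inputs into bbcode.
    sections = [
        f"[b]Scouting Report: Tennessee vs. {opponent}[/b]",
        "Social media: " + "; ".join(posts),
        "News: " + "; ".join(articles),
        "Stats notes: " + stats["notes"],
    ]
    return "\n".join(sections)

def one_click_report(opponent):
    """Run the whole workflow and return a bbcode draft for human editing."""
    posts = gather_social_posts(opponent)
    articles = gather_articles(opponent)
    stats = gather_stats(opponent)
    return synthesize_report(opponent, posts, articles, stats)

print(one_click_report("Texas"))
```

The point of the structure is that each gather step is independent, so you can swap models per step without touching the rest.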
 
#11
Analyzing Substitution Patterns
Although more technical than the Scouting Reports where I just talk to the Chatbots, these reports have become very easy. Basically, the steps have been:

Originally, I had a Claude $20/month subscription and set up to use Claude Code (https://claude.ai/code). As I started, I learned I needed a GitHub account. I had heard of it, but never used it -- I'm not a programmer. So I kept Claude Code in one window, opened the Claude chatbot in another, and asked it how to do that. Claude Chat makes it VERY easy to take a screenshot of what you're doing in a window and then ask questions about it. So it basically walked me through. Every time I got stuck, I took a screenshot and basically asked, "What do I do now?" I also had to learn to open PowerShell (I work in Windows) in the file folder where I wanted to keep the data.

Eventually, Claude Code generated Python code, stored on my system, that I could run from the folder on my desktop to go out to ESPN and pull the play-by-play data, first for the 2024-2025 season and then for the 2025-2026 season. I re-run the latter after every Tennessee game. It's the work of a few seconds for me and a minute or two for the Python code.
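For anyone curious what the generated scraper is probably doing under the hood, here's a minimal sketch. The endpoint pattern is an assumption: it's the unofficial ESPN summary API that community scrapers commonly use, not a documented interface, and the real Claude-generated code may well differ.

```python
import json
import urllib.request

# Hypothetical sketch of pulling one game's play-by-play from ESPN.
# This endpoint is unofficial and undocumented -- an assumption, not
# a guaranteed interface.
BASE = ("https://site.api.espn.com/apis/site/v2/sports/basketball/"
        "womens-college-basketball/summary")

def summary_url(event_id):
    """Build the summary URL for one game (event_id comes from the schedule)."""
    return f"{BASE}?event={event_id}"

def save_play_by_play(event_id, path):
    """Fetch the game summary and write its JSON to disk."""
    with urllib.request.urlopen(summary_url(event_id)) as resp:
        data = json.load(resp)
    with open(path, "w") as f:
        json.dump(data, f)
    return data

# Example (requires network): save_play_by_play("<event_id>", "game.json")
```

Re-running it per game is then just calling the function with the newest event id.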

One of the things that operation does is produce a "JSON" file (and, no, I don't really know what that is) with all the plays in it. I've got a chat going in the Claude chatbot about analyzing substitution patterns. I just upload the most current JSON file and ask for what I want, and for it to output in bbcode so I can display the table in this forum. Today, that included revising the format to put the substitutions in order by time stamp. All that required was asking, "Can you split out the substitutions and list them in order by time?"
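That "split out the substitutions and list them by time" request can be reproduced locally in a few lines. The field names below ("type", "period", "clock", etc.) are assumptions about the JSON's shape; the actual layout depends on what the generated scraper emitted.

```python
# Minimal sketch: filter substitution events from a play-by-play list
# and order them by game clock, then render a bbcode table.

def substitutions_in_order(plays):
    subs = [p for p in plays if p.get("type") == "substitution"]
    # Sort by period ascending, then clock descending (the clock counts down).
    return sorted(subs, key=lambda p: (p["period"], -p["clock_seconds"]))

def to_bbcode_table(subs):
    rows = [f"[tr][td]Q{p['period']}[/td][td]{p['clock']}[/td]"
            f"[td]{p['text']}[/td][/tr]" for p in subs]
    return "[table]" + "".join(rows) + "[/table]"

# Toy data, not a real Lady Vols file.
plays = [
    {"type": "shot", "period": 1, "clock": "9:40", "clock_seconds": 580,
     "text": "Jumper made"},
    {"type": "substitution", "period": 2, "clock": "5:00", "clock_seconds": 300,
     "text": "Spearman in for Cooper"},
    {"type": "substitution", "period": 1, "clock": "6:12", "clock_seconds": 372,
     "text": "Barker in for Zee"},
]
ordered = substitutions_in_order(plays)
print([p["text"] for p in ordered])  # Q1 sub comes before the Q2 sub
```

The chatbot does the same thing conceptually; the win is that you can ask for it in English instead of writing the lambda yourself.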

I'm retired and just playing. For part of my working life, I ran an organization that had access to a lot of data, and I did a lot to both improve the function of that organization and put the decision-making body I worked for in a position to make better decisions by running Microsoft Access queries against our SQL database. That sounds more technical than it really is. Today? I could do way more, way faster, and way better with these tools, and I still wouldn't know how to code.
 
#12
See the post above about the step-by-step, bring-in-results-from-other-AIs process I went through for Texas. I didn't think the result was worth the effort. So, for the Ole Miss game, I stayed in that same thread (meaning ChatGPT kept the "context" of what we had done -- i.e., it "remembered" whatever it had "learned" about what I wanted, or at least that's the way I understand it) and just used this prompt for Ole Miss:

Next game for Tennessee is against Ole Miss tomorrow. Give me a scouting report in the form of Texas. After finishing the report, double-check all factual statements.

Then for Texas A&M, I just copied that prompt, changed Ole Miss to Texas A&M and ran it again. I noticed in the "thinking" that ChatGPT was being careful about double-checking facts.

It should be possible to write a prompt or create a GPT that would make this process more reliable with less prompting, but I haven't worked on that yet.
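The copy-the-prompt-and-swap-the-opponent step is really just template substitution, which is one thing a custom GPT or small script could formalize. A minimal sketch (the template wording mirrors my prompt above; the function name is made up):

```python
# Hypothetical reusable prompt template for per-opponent scouting reports.
PROMPT_TEMPLATE = (
    "Next game for Tennessee is against {opponent} {when}. "
    "Give me a scouting report in the form of the Texas report. "
    "After finishing the report, double-check all factual statements."
)

def scouting_prompt(opponent, when="tomorrow"):
    """Fill in the template for a given opponent and game day."""
    return PROMPT_TEMPLATE.format(opponent=opponent, when=when)

print(scouting_prompt("Ole Miss"))
print(scouting_prompt("Texas A&M", when="on Sunday"))
```

Keeping the double-check instruction inside the template is what makes the fact-checking behavior carry over from game to game.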
 
#15
Here's the prompt that started my exploration with Claude Opus 4.6 of the "consistency" differences in substitution patterns between this season and last. It came back with the information that it could write the code and do the analysis itself, and that Claude Code would be better only if I wanted to run this repeatedly. I then asked it to put its output into bbcode and make it visually appealing. But here is the first prompt (after uploading the JSON files):

I have these two files with all the plays for the Lady Vols from their 2024-2025 and 2025-2026 season. Kim Caldwell has said that she has not been able to establish "consistent rotations" in the 2025-2026 season the way she did in the previous season. I assume by this that she had lineups or groups that played well together and that she could substitute using those groupings and get the expected performance on the floor. So, there should be some kind of consistency in the 2024-2025 groupings of players who spent time on the floor together that does not appear to the same extent in the 2025-2026 season. I think using Claude Code to set up an analysis of these two files for this is what I should do, but I don't know how to start.
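For the curious, here's one hypothetical way such a "consistency" question could be scored: measure how concentrated floor time is in a few recurring five-player groups. The player names and minutes below are toy data, not real Lady Vols numbers, and Claude's actual analysis may use an entirely different metric.

```python
from collections import Counter

def lineup_concentration(lineup_minutes, top_k=3):
    """Share of total minutes played by the top_k most-used lineups.

    lineup_minutes is a list of (set_of_five_players, minutes) pairs.
    A value near 1.0 means a few steady groups; lower means churn.
    """
    counts = Counter()
    for lineup, minutes in lineup_minutes:
        counts[frozenset(lineup)] += minutes
    total = sum(counts.values())
    top = sum(m for _, m in counts.most_common(top_k))
    return top / total

# Toy example: a "consistent" season leans on a few groups...
season_2425 = [({"A", "B", "C", "D", "E"}, 120),
               ({"A", "B", "C", "D", "F"}, 60),
               ({"A", "B", "C", "G", "H"}, 20)]
# ...while an "inconsistent" season spreads minutes across many.
season_2526 = [({"A", "B", "C", "D", "E"}, 40),
               ({"A", "B", "F", "G", "H"}, 40),
               ({"C", "D", "E", "F", "G"}, 40),
               ({"A", "C", "E", "G", "H"}, 40),
               ({"B", "D", "F", "H", "A"}, 40)]

print(lineup_concentration(season_2425))  # 1.0 (only three lineups used)
print(lineup_concentration(season_2526))  # 0.6 (minutes spread out)
```

Comparing the two numbers season over season is one concrete way to test the "couldn't establish consistent rotations" claim against the play-by-play data.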

Happy to share more or answer questions if anyone is interested.
 
#16
For the scouting report for the LSU game, I set up a project in X and put this in as the project instructions:

This project gathers research about the Lady Vols, analyzes statistics, and develops Scouting Reports for Lady Vol fans on the Vol Nation Forum.

Role: You're a research analyst who knows how to sift information, keeping just what Vol fans need to know, and then framing it in punchy, visually appealing posts in bb code format for the Vol Nation forum.

Requirements: factual accuracy. Vol Nation participants are knowledgeable and will catch any mistakes. Use the latest data. Put greater weight on the most recent results for Tennessee and their opponents. Double-check everything. However, posts need a unique "voice" -- one with verve and panache, but also authenticity and trustworthiness.

Sources: Top statistical sites for women's basketball, articles about Tennessee and opposing team, and social media commentary, including opposing team forums and posts on X. However, all quotes of social media posts MUST be verbatim and attributed to the proper user. This is journalistic ethics and is a form of factual accuracy that MUST be adhered to.

The first version of the report totally missed the Janiah Barker situation, so I re-prompted for that. It was also even more upbeat than this version. As a human, I might have focused on how commentators have swarmed to dogpile Tennessee, led by Andraya Carter's Game Day, uh... comments? Rant? Lambasting? Free-floating negativity? Yeah. I'm still bitter about that, and I still think it was more pontificating than fair commentary. But that's just me.
 
#17
I just put a post in the Vandy game thread that compared the stats for PTS and REB for Barker, Cooper, and Spearman over the last three games. In doing that, I learned something:

Despite the ability to "browse the web," AIs as currently configured often do not have access to the same data we do.

When my first general attempts at this failed and I noticed the AIs seemed to be "constructing" the data from weird sources, I gave Claude the URL for the box score for a game on the UT website, then asked it to pull the data for the other games. It was able to figure out what the URLs for those games' box scores would be, but then said,

I'm running into a permissions constraint — the system only allows me to fetch URLs that you explicitly provide or that appear verbatim in search results. Your LSU link worked because you typed it directly. I constructed the others from the schedule page, but they're being blocked.

As it stands today, most of these systems seem to be intentionally handicapped against just "clicking through" links to get to data. Or, whatever. I may not be saying that right. But, bottom line, what looks like the most obvious and easily accessible source of data on a topic sometimes isn't available to them, because they have been intentionally blocked from accessing it. I don't know if the "agentic" systems that can "run a browser" could do this better.
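What the model was attempting is trivial for a script: derive the other box-score URLs by analogy with the one known link. The URL pattern and domain below are hypothetical, purely to illustrate the construction step that the permissions system blocked.

```python
# Hypothetical URL-by-analogy construction. The domain and path scheme
# are made up for illustration; they are NOT the real UT athletics URLs.
KNOWN_URL = "https://example-athletics.com/boxscore/2026/01/25/lsu"

def boxscore_url(date, opponent):
    """Build a box-score URL from a date ("YYYY-MM-DD") and opponent name,
    following the assumed pattern of the known link above."""
    year, month, day = date.split("-")
    slug = opponent.lower().replace(" ", "-").replace("&", "")
    return f"https://example-athletics.com/boxscore/{year}/{month}/{day}/{slug}"

print(boxscore_url("2026-02-01", "Texas A&M"))
```

The model did exactly this kind of inference from the schedule page; it was the fetch, not the construction, that got blocked.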

This seems like a VERY TEMPORARY phenomenon. The pressure to re-configure data on the web for AI access and to empower the AIs to access the best data regardless of format should drive a reduction in this effect quite rapidly. Still, it's weird.
 
