Critiquing AI's pre-game analysis and prognostication abilities

#1

BruisedOrange

I was wondering how the Florida game matched up against that AI-assisted scouting report, so I'm starting this thread where people who got to see the game(s) can compare and critique against AI's analysis & prognostications.

If there's interest, this thread could provide a place for ongoing critique of AI's game prediction skills, leaving the game threads to "higher, bio-chemical, featherless-biped" analysis. To the degree that AI is only assimilating what humans have written and posted, with time and increased sample size it should begin to assign more weight to certain sources over others and improve its accuracy, especially if it monitors this forum.

For your critiques, below is what RetroVol (generously) posted for the Florida official game thread:

 
#2
Great idea! People might also mention what they'd like to see. Or, if anyone wants to roll their own for a game, I'm fine with letting them step forward! I used Gemini only for Auburn, but I could go back to giving the same prompt to the big 4 and then start posting different ones for variety. I could also try prompting for shorter reports or for more varied ones and share some of the extremes. I just have fun playing with new technologies -- I've done it before and it sort of shaped my career.

For Florida, one of the most interesting parts was how it picked up on the heavy international route they had gone in recruiting. Very interesting, and not sure I would have picked it up just from watching the game.
 
#3
I'm beginning to think of this period in our relationship with AI as kinda like getting time to date the person to whom you've been committed in a contract marriage.

You know, a chance to casually get to know the person who's gonna be telling you what to do for the rest of your life.
 
#6
In my experience, LLMs are better at summarizing historical information than predicting the future. But why not have the LLM critique itself? I asked Gemini to do so, and here is its verdict:

Based on the box score and game recap from yesterday's contest at the Food City Center, the "Scouting Dossier" generated prior to the game was remarkably accurate in its player assessment but slightly off on the tactical narrative of how the win would be achieved.

Here is the breakdown of the Dossier's accuracy compared to the actual 76-65 Tennessee victory.

Quick Verdict: B+ (High Accuracy on Personnel, Missed the Tactical Method)

The dossier correctly identified the winner, the margin of victory (within 5 points), and the specific player who would dominate for Florida. However, it predicted a "chaos/turnover" victory, whereas the actual game was won in the trenches (rebounding) with the press being less effective than advertised.


1. Scoreboard & Outlook Predictions

Metric | Dossier Prediction | Actual Result (Jan 1, 2026) | Accuracy Verdict
Winner | Tennessee | Tennessee | ✅ Correct
Final Score | UT 88, UF 72 (+16) | UT 76, UF 65 (+11) | ✅ Close (Within 5 pts of margin)
Game Flow | "High-scoring, chaotic... cumulative effect breaks Florida." | Grinding game. UT led early, UF rallied, UT pulled away late. | ⚠️ Mixed (Lower scoring than predicted)
Vegas Line | Predicted UT -12.5 to -15.5 | UT won by 11. | ✅ Covered the spread logic
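
For anyone who wants to check the scoreboard arithmetic in that table, here is a minimal sketch. The `Score` class and the `covers` helper are hypothetical names made up for illustration, and the cover rule is an assumption: the usual convention that a favorite covers only by winning by more than the points it lays.

```python
from dataclasses import dataclass


@dataclass
class Score:
    """A final score, from the favorite's point of view (hypothetical helper)."""
    favorite: int
    underdog: int

    @property
    def margin(self) -> int:
        return self.favorite - self.underdog

    @property
    def total(self) -> int:
        return self.favorite + self.underdog


def covers(actual_margin: float, points_laid: float) -> bool:
    # Assumed convention: the favorite covers only if it wins by MORE than the spread.
    return actual_margin > points_laid


predicted = Score(favorite=88, underdog=72)  # dossier: UT 88, UF 72
actual = Score(favorite=76, underdog=65)     # result:  UT 76, UF 65

margin_error = abs(predicted.margin - actual.margin)  # 16 vs 11
total_error = abs(predicted.total - actual.total)     # 160 vs 141
covered = {s: covers(actual.margin, s) for s in (12.5, 15.5)}

print(margin_error, total_error, covered)
```

Under that assumed rule, an 11-point win lands within 5 points of the predicted margin but does not cover either end of the -12.5 to -15.5 range, so the "covered" verdict in the table is probably best read as "the margin came in near the predicted range." The combined score also came in 19 points under the prediction, which fits the "lower scoring than predicted" note.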

2. Personnel Scouting: The "Liv McGill" Factor

The dossier’s most impressive feat was identifying Liv McGill as a singular, overwhelming force who would play nearly the entire game.

  • Dossier Claim: "McGill is not just Florida's best player... video game numbers... focal point for all 40 minutes."
  • Actual Performance: McGill played 38 minutes and scored a game-high 32 points (13-24 FG).
  • The "Ceiling" Prediction: The dossier noted her "38-point ceiling" from the Chattanooga game. She nearly matched it with 32 points against an SEC defense.
  • The Miss: The dossier predicted she would be "prone to turnovers" (averaging 5.1). In reality, she handled the pressure reasonably well, committing 5 turnovers—high, but not the "disaster" predicted given her usage rate.
Me’Arah O’Neal Prediction vs. Reality:

  • Dossier: Predicted her as a "double-double machine" and "Interior Anchor."
  • Actual: 6 points, 9 rebounds. She struggled mightily offensively (2-6 FG) and was stifled by Tennessee’s frontcourt, though she battled on the boards.

3. Tactical Breakdown: The "Turnover" Myth

This is where the dossier diverged from reality. The prediction relied heavily on Tennessee's "Kim Caldwell System" forcing 25-30 points off turnovers.

  • Prediction: "Expect Tennessee to generate 25-30 points off turnovers... If Florida commits 20+, Tennessee wins comfortably."
  • Reality: Florida committed 16 turnovers—a manageable number. Surprisingly, Tennessee committed 17 turnovers, actually losing the turnover battle.
  • The Real Winning Edge: The dossier briefly mentioned Florida’s "size loss" (Ra Shaya Kyle), but anticipated a rebound margin of +8.2 for Florida.
    • Actual Reality: Tennessee obliterated Florida on the glass, 49-26 (including 20 offensive rebounds).
    • Correction: The dossier underestimated how much Florida’s smaller lineup would get punished by Janiah Barker (12 rebs) and Zee Spearman (10 rebs). The game was won on second-chance points, not transition chaos.

4. Weakness Identification

  • 3-Point Shooting:
    • Prediction: "Major Weakness. Florida cannot shoot the three (26.1%)."
    • Actual: Spot on. Florida shot 26.1% (6-of-23) from deep. Tennessee’s defense correctly packed the paint, forcing O'Neal into tough spots and daring the guards to shoot.
  • Depth vs. Fatigue:
    • Prediction: "Florida simply does not have the bodies... cumulative effect."
    • Actual: Validated. Florida scored only 13 points in the 4th quarter and went scoreless for a critical 3-minute stretch late, allowing UT to go on an 8-0 run to seal it. McGill (38 mins) and Reynolds (33 mins) wore down exactly as predicted.
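
The shooting and rebounding figures in sections 3 and 4 can be sanity-checked the same way. This is only a sketch: `pct` is a hypothetical helper, and it assumes the dossier's "+8.2 for Florida" rebound margin can be compared directly with a single-game margin.

```python
def pct(made: int, attempted: int) -> float:
    """Shooting percentage rounded to one decimal place (hypothetical helper)."""
    return round(100 * made / attempted, 1)


# Florida from three: 6-of-23, as reported in the recap.
florida_3pt = pct(6, 23)

# Rebounds: Tennessee 49, Florida 26.
ut_rebounds, uf_rebounds = 49, 26
actual_margin_ut = ut_rebounds - uf_rebounds

# Dossier's anticipated margin, in Florida's favor (assumed comparable to one game).
predicted_margin_uf = 8.2

# Gap between the anticipated Florida edge and the actual Tennessee edge.
swing = round(actual_margin_ut + predicted_margin_uf, 1)

print(florida_3pt, actual_margin_ut, swing)
```

The 6-of-23 line does round to the quoted 26.1%, and the rebounding miss is large: a +23 edge for Tennessee against an anticipated edge for Florida.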
 
#7
Oh, yeah! 👍
 
#8
LOVE IT!
 
#9
😒 Yeah, okay. So AI can accurately critique itself. [pretty cool, actually]

But can it bloviate? Can it post images that are only superficially pertinent to the discussion? Can it relentlessly harangue posters for what they posted 2 months ago? Thought not. 😎 We back!
 
#10
I am an A.I. moron, so anything any of you do with it impresses the hell out of me, and I appreciate you sharing your experiments with it. 🥹
 