Test data notes

Here are some notes about this testing data. They are accurate as of 2025-02-03, so be wary if you are reading this much later.

I ran these abstracts against the models on that date, and these are the results they came back with:

Model	Abstract 1	Abstract 2	Abstract 3
deepseek-r1:latest	25%	0%	0%
granite3-dense:latest	50%	80%	80%
granite3.1-dense:latest	80%	80%	80%
llama3.1:latest	80%	23%	82%

As we add more test data, we should keep this overview of the different models tracked here.
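
To make the overview easier to compare as rows and columns are added, the table above could be summarized with a small script. This is only a sketch: the `TABLE` constant is a copy of the scores above, `mean_scores` is a hypothetical helper, and a plain average per model is an assumption about what summary is useful, not something these notes define.

```python
import csv
import io

# Copy of the tab-separated table from these notes (2025-02-03 run).
TABLE = (
    "Model\tAbstract 1\tAbstract 2\tAbstract 3\n"
    "deepseek-r1:latest\t25%\t0%\t0%\n"
    "granite3-dense:latest\t50%\t80%\t80%\n"
    "granite3.1-dense:latest\t80%\t80%\t80%\n"
    "llama3.1:latest\t80%\t23%\t82%\n"
)


def mean_scores(table_text):
    """Return {model: mean percentage} for a tab-separated score table."""
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    next(reader)  # skip the header row
    result = {}
    for row in reader:
        if not row:  # ignore blank lines
            continue
        model, *cells = row
        # Strip the trailing "%" and convert each cell to a number.
        scores = [float(cell.rstrip("%")) for cell in cells]
        result[model] = sum(scores) / len(scores)
    return result


if __name__ == "__main__":
    for model, mean in mean_scores(TABLE).items():
        print(f"{model}\t{mean:.1f}%")
```

Because the helper reads however many abstract columns each row has, new abstracts can be appended to the table without changing the script.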
