**WIP** Added qemu_tmin #2942

Open
wants to merge 4 commits into
base: main


WorksButNotTested
Collaborator

I have resurrected qemu_tmin (see here). It seems to run OK, and some of the outputs are smaller than the inputs! But there seem to be way more outputs than there are inputs. What am I doing wrong?

@domenukk
Member

> But there seem to be way more outputs than there are inputs?

Could it be related to the recent "deduplication" changes in the OnDiskCorpus (#2827)? Maybe intermediate files don't get deleted properly anymore?

@WorksButNotTested
Collaborator Author

Does the de-duplication only deduplicate files if the content is identical? Or is it based upon the observer output? Does it favour smaller files?

Member

@rmalmain rmalmain left a comment

it would be nice to add some form of test once it's working.
otherwise it's the kind of thing that can break silently

@@ -0,0 +1,300 @@
[env]
Member

we are migrating to Justfiles. once that is finished, this will need to be migrated too, following the same model as qemu_cmin.

@@ -0,0 +1,6704 @@
[cargo-make] INFO - cargo make 0.37.16
Member

can you remove this file please?

@dhanvithnayak
Contributor

@WorksButNotTested yes files are deduplicated based on file content. the idea is that the filename is a hash of the file contents, so if we get duplicate objectives we just increase a counter in the lockfile instead of writing another copy of the testcase. if you are naming testcases yourself such that identical testcases may have different filenames, deduplication will break and we end up with duplicate files
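To make the naming scheme described above concrete, here is a minimal in-memory sketch. It is not LibAFL's implementation: `DedupStore`, `content_name`, and the use of `DefaultHasher` are assumptions made for illustration, and the `HashMap` stands in for both the corpus directory and the lockfile counter.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an on-disk corpus directory.
#[derive(Default)]
struct DedupStore {
    /// File name -> (file contents, how many times this content was added).
    files: HashMap<String, (Vec<u8>, usize)>,
}

/// Derive the file name from the contents alone (assumed hash, not LibAFL's).
fn content_name(bytes: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

impl DedupStore {
    /// Add a testcase. Identical bytes map to the same name, so a duplicate
    /// only bumps the counter instead of creating a second file.
    fn add(&mut self, bytes: &[u8]) -> (String, usize) {
        let name = content_name(bytes);
        let entry = self
            .files
            .entry(name.clone())
            .or_insert_with(|| (bytes.to_vec(), 0));
        entry.1 += 1;
        (name, entry.1)
    }
}
```

Under this scheme a minimized testcase has different bytes than its seed, so it gets a different name and is stored as an additional file rather than replacing the seed.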

@WorksButNotTested
Collaborator Author

> @WorksButNotTested yes files are deduplicated based on file content. the idea is that the filename is a hash of the file contents, so if we get duplicate objectives we just increase a counter in the lockfile instead of writing another copy of the testcase. if you are naming testcases yourself such that identical testcases may have different filenames, deduplication will break and we end up with duplicate files

In my case, I'm not sure that will help. After each of my seeds has run through the minimizer, I'm hoping to end up with a different (smaller) file that still generates the same coverage. So I guess I would need the test cases to be named after a hash of the coverage map (assuming the hash is long enough to avoid collisions). I would also need the de-duplication to favour the smaller test case when deciding which to keep. I'm not sure exactly how I'd go about that, so any pointers would be great!
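One way to picture the idea in this comment is a map keyed by a hash of the coverage map that only ever keeps the smallest input per key. This is a sketch under assumptions, not an existing LibAFL feature: `SmallestPerCoverage`, `coverage_key`, and `offer` are hypothetical names, the coverage map is taken to be a plain byte slice, and the 64-bit `DefaultHasher` is used only for brevity (a longer hash would lower the collision risk mentioned above).

```rust
use std::collections::hash_map::{DefaultHasher, Entry};
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hash a coverage map into a key (assumed 64-bit hash, for illustration).
fn coverage_key(map: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    map.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical store that keeps one input per distinct coverage map.
#[derive(Default)]
struct SmallestPerCoverage {
    best: HashMap<u64, Vec<u8>>,
}

impl SmallestPerCoverage {
    /// Offer an input together with the coverage it produced.
    /// Returns true if the input was stored, i.e. it is the first input
    /// for this coverage or strictly smaller than the one already kept.
    fn offer(&mut self, coverage: &[u8], input: &[u8]) -> bool {
        match self.best.entry(coverage_key(coverage)) {
            Entry::Vacant(slot) => {
                slot.insert(input.to_vec());
                true
            }
            Entry::Occupied(mut slot) => {
                if input.len() < slot.get().len() {
                    slot.insert(input.to_vec());
                    true
                } else {
                    false
                }
            }
        }
    }
}
```

With this shape, the number of stored files is bounded by the number of distinct coverage maps, regardless of how many intermediate candidates the minimizer produces.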

@domenukk
Member

You can always remove the testcase and then add the minimized one

@domenukk
Member

> You can always remove the testcase and then add the minimized one

@WorksButNotTested I think this is the best way to go forward
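The remove-then-add approach suggested above could look roughly like the following. This is a sketch against a hypothetical `TinyCorpus` type written for this example, not the LibAFL `Corpus` trait, and the "keep the original unless the candidate is strictly smaller" rule is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal corpus: id -> testcase bytes.
#[derive(Default)]
struct TinyCorpus {
    next_id: usize,
    entries: BTreeMap<usize, Vec<u8>>,
}

impl TinyCorpus {
    /// Add a testcase and return its freshly assigned id.
    fn add(&mut self, bytes: Vec<u8>) -> usize {
        let id = self.next_id;
        self.next_id += 1;
        self.entries.insert(id, bytes);
        id
    }

    /// Remove a testcase, returning its bytes if the id existed.
    fn remove(&mut self, id: usize) -> Option<Vec<u8>> {
        self.entries.remove(&id)
    }

    /// Remove the original and add the minimized testcase in its place.
    /// Returns the id that now represents the testcase, or None if `id`
    /// is unknown. The original is kept when the candidate is not smaller.
    fn swap_in_minimized(&mut self, id: usize, minimized: Vec<u8>) -> Option<usize> {
        let original_len = self.entries.get(&id)?.len();
        if minimized.len() >= original_len {
            return Some(id);
        }
        self.remove(id);
        Some(self.add(minimized))
    }
}
```

Because every swap removes exactly one entry and adds exactly one, the corpus ends up with as many outputs as there were inputs.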

@WorksButNotTested WorksButNotTested changed the title Added qemu_tmin **WIP** Added qemu_tmin Feb 28, 2025
4 participants