Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revive qemu logs - and extend the FAQ #1978

Merged
merged 3 commits into from
Sep 21, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 70 additions & 0 deletions docs/FAQ.md
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,76 @@ Therefore, if you still prefer the previous `paddr = vaddr` simple mapping, we h

In theory, `UC_TLB_VIRTUAL` will achieve better performance as it skips all MMU details, though not benchmarked.

## Something is wrong - I would like to dig deeper

Unicorn uses at several places logging by the qemu implementation.
This might provide a first glance what could be wrong.

The logs contains optionally the filename and the line number including
additional messages to indicate what is happening.
However, the qemu logs are partially commented-out and incomplete, but give it a try.
You might want to dig deeper - and add your own log messages where you expect or try to find the bug.

See the `unicorn/qemu/include/qemu/log.h` file for details.
To enable logs, you must recompile Unicorn with the enabled `LOGGING_ENABLED` define.

Logs are written in different log levels, which might result into a very verbose logging if enabled.
To control the log level information, two environment variables could be used.

`UNICORN_LOG_LEVEL` and `UNICORN_LOG_DETAIL_LEVEL`.

These environment variables are parsed into `uint32_t` values once, (due to performance reasons)
so set these environment variables before you execute any line of Unicorn.
Allowed are hexa-decimal, decimal and octal values, which fits into a buffer of 10 chars. (see stroul for details).

To define how detailed and what should be logged, use the following environment variables:

- `UNICORN_LOG_LEVEL`=\<32bit mask\>
- The qemu bit mask what should be logged.
- Use the value of `UINT32_MAX` to log everything.
- If no bit is set in the mask, there will be no logging.
- `UNICORN_LOG_DETAIL_LEVEL`=\<level\>
- The level defines how the filename and line is constructed.
- 0: no filename and no line is used.
- 1: full filename including the leading path is used with line information.
- 2: just the filename with line information. It might be a little confusing,
as the file name can be used in several places.
- If unsure or unwanted, leave this variable undefined or set it to 0.

As an example to set up the environment for python correctly, see the example below.

```python
import os
os.environ['UNICORN_LOG_LEVEL'] = "0xFFFFFFFF" # verbose - print anything
os.environ['UNICORN_LOG_DETAIL_LEVEL'] = "1" # full filename with line info
```

Please note that file names are statically compiled in and can reveal the paths
of the file system used during compilation.

## My code does not do what I would expect - is this a bug?

Please create an github issue and provide as much details as possible.

- [ ] Simplified version of your script / source
- Make sure that "no" external dependencies are needed.
- E.g. remove additional use of capstone or CTF tools.
- [ ] Used Unicorn git-hash commit
- Make sure to exclude any changes of you made in unicorn.
- Alternativily provide the repo link to your commit.
- [ ] Detailed explaination what is expected
- Try to verify if the instructions can be processed by qemu.
- Dumping the registers of unicorn and qemu helps a lot.
- [ ] Detailed explaination what is observed
- Describe what's going on (and what you might think about it).
- [ ] Output from your executed script
- You might have additional log messages which could be helpful.
- [ ] Output from the qemu-logs
- Try to gather more informations by enabling the qemu logging.
- [ ] More details
- Attach more details to help reproduce the bug.
- Like attaching a repo link to the CTF challenge containing the binary or source code.

## I'd like to make contributions, where do I start?

See [milestones](https://github.com/unicorn-engine/unicorn/milestones) and [coding convention](https://github.com/unicorn-engine/unicorn/wiki/Coding-Convention
Expand Down
211 changes: 199 additions & 12 deletions qemu/include/qemu/log.h
Original file line number Diff line number Diff line change
Expand Up @@ -30,20 +30,207 @@

/* Logging functions: */

/* log only if a bit is set on the current loglevel mask:
* @mask: bit to check in the mask
* @fmt: printf-style format string
* @args: optional arguments for format string
/* To verbose logging, enable the next line. */
//#define LOGGING_ENABLED // to enable logging
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add a cmake variable for this. If you are not familiar, I can do this for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to append an commit on top - the cmake files are quite huge and I am little bit too scared right now by them 😆

I thought I had seen a refactoring of the CMake files as PR somewhere, but that might be a different project.

Anyway, I would like to see a “cmake” folder that abstracts each CPU architecture and the required files into multiple cmakefiles - instead of a monolithic approach. There are reasons doing it like this I think, and I did not thought about that (or have looked deeper into the CMakefiles.)
But that's a different topic and shouldn't be clarified here.


#ifdef LOGGING_ENABLED

#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdio.h>

/**
* Reads the @p env_name and tries to parse the value into an uint32_t.
* @param env_name The environment variable name to parse.
* @return The parsed value.
** 0 in case if the value can not be parsed (or is 0).
** ULONG_MAX if the value is bigger than an uint32_t.
*/
static inline uint32_t read_and_parse_env(const char* env_name)
{
uint32_t data = 0;
const char* env_data = getenv(env_name);

if (env_data != NULL) {
char buffer[11] = {0}; // 0xFFFF'FFFF\0
strncpy(buffer, env_data, sizeof(buffer) - 1);
data = (uint32_t)strtoul(buffer, NULL, 0);
}

return data;
}

/**
* Gets the log level by reading it once.
* @return The log level.
*/
static inline uint32_t get_log_level()
{
static uint64_t log_level = UINT64_MAX;

if (log_level == UINT64_MAX) {
log_level = read_and_parse_env("UNICORN_LOG_LEVEL");
}

return (uint32_t)log_level;
}

/**
* Gets the log detail level by reading it once.
* @return The detail log level.
*/
static inline uint32_t get_log_detail_level()
{
static uint64_t log_detail_level = UINT64_MAX;

if (log_detail_level == UINT64_MAX) {
log_detail_level = read_and_parse_env("UNICORN_LOG_DETAIL_LEVEL");
}

return (uint32_t)log_detail_level;
}

/**
* Checks if the @p log_level is active.
* @param log_level The log level to be checked.
* @return True if the log level is active.
*/
#define qemu_log_mask(MASK, FMT, ...)
static inline bool is_log_level_active(uint32_t log_level)
{
const uint32_t active_log_level = get_log_level();
const bool is_active = (active_log_level & log_level) == log_level;

/* log only if a bit is set on the current loglevel mask
* and we are in the address range we care about:
* @mask: bit to check in the mask
* @addr: address to check in dfilter
* @fmt: printf-style format string
* @args: optional arguments for format string
return is_active;
}

/**
* Checks if the logging is enabled.
* @return True if enabled, else false.
*/
#define qemu_log_mask_and_addr(MASK, ADDR, FMT, ...)
static inline bool is_logging_enabled()
{
const bool log_level = get_log_level();

return log_level != 0;
}

/**
* Gets the filename of the caller on given @p detail_level.
* @param filename The filename to process on.
* @param detail_level The level of detail of the filename.
** 0: Returns an empty string.
** 1: Returns the full filename including it's path.
** 2: Returns just the filename (to shorten the log).
* @return always an valid null-terminated string. Do NOT free it.
*/
static inline const char* const get_detailed_filename(const char* filename,
int detail_level)
{
filename = (filename != NULL) ? filename : "";
const char* resulting_filename = filename;

#if (defined(WIN32) || defined(WIN64) || defined(_WIN32) || defined(_WIN64))
const char path_separator = '\\';
#else
const char path_separator = '/';
#endif

switch (detail_level) {
default:
case 0:
resulting_filename = "";
break;
case 1:
resulting_filename = filename;
break;
case 2:
resulting_filename = strrchr(filename, path_separator);

if (resulting_filename == NULL) {
resulting_filename = filename;
}
else {
++resulting_filename;
}

break;
}

return resulting_filename;
}

/**
* Prints the formatted log message with details if enabled.
* @param mask The log mask to log on.
* @param filename The filename of the caller.
* @param line The line number of the caller.
* @param fmt Printf-style format string.
* @param args optional arguments for the format string.
*/
static inline void print_log(uint32_t mask, const char* filename, uint32_t line, const char* fmt, ...)
{
if ((mask & get_log_level()) == 0) {
return;
}

const uint32_t log_detail_level = get_log_detail_level();

if (log_detail_level > 0) {
const char* detailed_filename = get_detailed_filename(filename, log_detail_level);
printf("[%s:%u] ", detailed_filename, line);
}

va_list argptr;

va_start(argptr, fmt);
vfprintf(stdout, fmt, argptr);
va_end(argptr);
}

/**
* Logs only if the right log level is set.
* @param mask The log mask to log on.
* @param fmt Printf-style format string.
* @param args optional arguments for the format string.
*/
#define LOG_MESSAGE(mask, fmt, ...) \
do { \
print_log(mask, __FILE__, __LINE__, fmt, ## __VA_ARGS__); \
} while (0)
#else
#define LOG_MESSAGE(mask, fmt, ...)

/**
* Dummy implementation which returns always false.
* @return Always false.
*/
static inline bool is_logging_enabled()
{
return false;
}

/**
* Dummy implementation which returns always false.
* @param level The log level to be checked.
* @return Always false.
*/
static inline bool is_log_level_active(uint32_t level)
{
(void)level;

return false;
}
#endif /* LOGGING_ENABLED */

/**
* Logs only if the right log level is set.
* @param mask The log mask to log on.
* @param fmt Printf-style format string.
* @param args Optional arguments for the format string.
*/
#define qemu_log_mask(mask, fmt, ...) \
LOG_MESSAGE(mask, fmt, ## __VA_ARGS__)

#endif /* QEMU_LOG_H */
4 changes: 2 additions & 2 deletions qemu/tcg/tcg.c
Original file line number Diff line number Diff line change
Expand Up @@ -3728,7 +3728,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
TCGOp *op;

#ifndef NDEBUG
if (getenv("UNICORN_DEBUG")) {
if (is_log_level_active(CPU_LOG_TB_IN_ASM)) {
tcg_dump_ops(s, false, "TCG before optimization:");
}
#endif
Expand Down Expand Up @@ -3777,7 +3777,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
#endif

#ifndef NDEBUG
if (getenv("UNICORN_DEBUG")) {
if (is_log_level_active(CPU_LOG_TB_IN_ASM)) {
tcg_dump_ops(s, false, "TCG before codegen:");
}
#endif
Expand Down
Loading