M7350/qcom-opensource/kernel/kernel-tests/memory_prof/README.txt

548 lines
16 KiB
Plaintext
Raw Permalink Normal View History

2024-09-09 08:57:42 +00:00
memory_prof is a suite of tools for profiling memory-related
performance as well as catching regressions. The suite is named after
the original memory_prof program, but now includes other programs and
scripts that can be used for memory profiling.
__
/ _|
_ __ ___ ___ _ __ ___ ___ _ __ _ _ _ __ _ __ ___ | |_
| '_ ` _ \ / _ \ '_ ` _ \ / _ \| '__| | | | | '_ \| '__/ _ \| _|
| | | | | | __/ | | | | | (_) | | | |_| | | |_) | | | (_) | |
|_| |_| |_|\___|_| |_| |_|\___/|_| \__, | | .__/|_| \___/|_|
__/ |_____| |
|___/______|_|
Usage: See memory_prof -h
2024-09-09 08:52:07 +00:00
Description:
2024-09-09 08:57:42 +00:00
These tests are useful for catching performance regressions in Ion or
general memory code (using the -e and -k options). They can also catch
other Ion regressions by performing some basic sanity tests (the -b,
-m, and -l options).
2024-09-09 08:52:07 +00:00
Notes:
2024-09-09 08:57:42 +00:00
This test suite is accompanied by a kernel module that must be
inserted for certain test cases (namely -k and -m). The memory_prof.sh
script will take care of inserting the kernel module and running the
memory_prof binary for you. However, sometimes it's useful to be able
run the memory_prof binary directly without inserting the kernel
module.
Information about the format of allocation profiles (specified
with -i) can be found at the end of this document in Appendix A.
Target support: All
__ _
/ _| | |
_ __ ___ ___ _ __ ___ | |_ ___ __ _ ___| |_
| '_ ` _ \ / _ \ '_ ` _ \| _/ _ \/ _` / __| __|
| | | | | | __/ | | | | | || __/ (_| \__ \ |_
|_| |_| |_|\___|_| |_| |_|_| \___|\__,_|___/\__|
Usage: See memfeast -h
Description:
memfeast reliably and predictably forces the system into a
low-memory condition.
Target support: All
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Appendix A: Allocation Profiles for memory_prof
custom "allocation profile" input files can be specified with -i.
The format of the allocation profile file is a comma-separated values file
with the following columns:
op
[rest...]
The `op' field specifies what kind of operation this line
holds. The following operations are currently supported:
- alloc
- sleep
- print
- simple_alloc
- simple_basic_sanity
- simple_profile
- simple_free
- alloc_pages
- create_unused_client
- free_all_unused_clients
- user_alloc
- iommu_map_range
- conc
- lat
- stability
Each operation is described in detail below.
The remaining fields ([rest...]) are defined differently for
different values of `op'.
In all cases, all defined fields are *required* and cannot be left
empty (e.g. `something,,other'). For example, if you don't have
any flags to pass for alloc, put a 0 rather than leaving it blank.
o `op' == alloc
When `op' == alloc, an ION_IOC_ALLOC/ION_IOC_FREE will be
performed and profiled. Additionally, mmap'ing and memset'ing
the buffer will be performed and profiled if profile_mmap and
profile_memset are set.
The following remaining fields are defined:
reps
heap_id
flags
alloc_size
alloc_size_label
quiet_on_failure
profile_mmap
profile_memset
- reps :: How many times to repeat this allocation
- heap_id :: Heap to use for allocation. Should correspond to a
heap_id from `enum ion_heap_ids'. E.g.:
ION_CP_MM_HEAP_ID
- flags :: Flags to be used for allocation. Can parse bitwise OR'd
ION_FLAG_* constants (e.g.:
ION_SECURE|ION_FLAG_CACHED). No spaces please.
- alloc_size :: The size of the buffer to be allocated. Can be
any valid size string (e.g. "4KB", "2MB", etc). Supported
suffixes are "KB" "MB" and "GB" (or no suffix for bytes).
- alloc_size_label :: A human- (and script-) readable string
describing the allocation size
- quiet_on_failure :: Whether we should print an error message if
this allocation fails
- profile_mmap :: Whether we should profile mmap
- profile_memset :: Whether we should profile memset
Blank lines and lines beginning with '#' are skipped.
See alloc_profiles/general.txt for a full example.
o `op' == sleep
When `op' == sleep, a usleep will be inserted with the specified
number of microseconds.
The following remaining fields are defined:
time_us
- time_us :: The time (in microseconds) to sleep
o `op' == print
When `op' == print, the remaining text on the line is printed to
stdout.
The following remaining fields are defined:
rest
- rest :: The text to print
o `op' == simple_alloc
When `op' == simple_alloc, an ION_IOC_ALLOC will be performed. A
matching ION_IOC_FREE will *not* be performed. To free a buffer
allocated with `simple_alloc' you should use the `simple_free'
op (defined below) with the same alloc_id field.
The following remaining fields are defined:
alloc_id
heap_id
flags
alloc_size
alloc_size_label
- alloc_id :: a user-defined ID that can be used in a
`simple_free' line (see below) to free this allocation. This
can actually be any string.
- heap_id, flags, alloc_size, alloc_size_label :: the same
was as for the `alloc' op, above
See `simple_free' for an example of how this can be used.
o `op' == simple_basic_sanity
When `op' == simple_basic_sanity, we iterate through all buffers
previously allocated with `simple_alloc' until we find one with
a matching alloc_id. We do basic sanity testing of the first one
we find with the same algorithm as `./memory_prof -b'.
The following remaining fields are defined:
alloc_id
Important: this operation will always fail on anything except a
freshly allocated buffer. For example, the following sequence is
BAD:
simple_alloc,pizza,ION_SYSTEM_HEAP_ID,ION_FLAG_CACHED,0x100000,1MB
simple_profile,pizza
simple_basic_sanity,pizza
But the following sequence is GOOD:
simple_alloc,pizza,ION_SYSTEM_HEAP_ID,ION_FLAG_CACHED,0x100000,1MB
simple_basic_sanity,pizza
simple_profile,pizza
o `op' == simple_profile
When `op' == simple_profile, we iterate through all buffers
previously allocated with `simple_alloc' until we find one with
a matching alloc_id. We profile the first one we find with the
same algorithm as `op' == alloc.
The following remaining fields are defined:
alloc_id
o `op' == simple_free
When `op' == simple_free, an ION_IOC_FREE will be performed on
the buffer identified by alloc_id. See the `simple_alloc' op
above.
The following remaining fields are defined:
alloc_id
- alloc_id :: the user-defined ID that was used in an earlier
`simple_alloc'
Here's an example of an allocation profile using
simple_alloc/simple_profile/simple_free:
simple_alloc,1,ION_SYSTEM_HEAP_ID,ION_FLAG_CACHED,0x100000,1MB
simple_alloc,pizza,ION_SYSTEM_HEAP_ID,ION_FLAG_CACHED,0x100000,1MB
simple_profile,pizza
# there are now two Ion buffers allocated. Now free them both:
simple_free,1
simple_free,pizza
o `op' == alloc_pages
alloc_pages is used to directly profile the kernel's buddy
allocator.
The following remaining fields are defined:
order
gfp_flags
- order :: the order of the page to allocate with the kernel's
`alloc_pages' routine.
- gfp_flags :: the gfp flags to use. Note that these are not
real gfp flags as known by the kernel. Since the kernel's
actual gfp flags are not exported to userspace and it would be
too much effort to try to mirror all of them, we define some
new flags for the most commonly used gfp flags. Currently
supported flags:
MP_GFP_KERNEL
MP_GFP_HIGHMEM
MP_GFP_ZERO
MP_GFP_HIGHUSER
MP_GFP_NOWARN
MP_GFP_NORETRY
MP_GFP_NO_KSWAPD
MP_GFP_WAIT
MP_GFPNOT_WAIT
You can't compose gfp flags the same way you can in the kernel
like: GFP_KERNEL & ~GFP_WAIT since these are simple,
independent bitfields. To accomplish not'ing something out, a
dedicated flag must be used, like MP_GFPNOT_WAIT. Note,
however, that these *can* be |'d together just like the flags
in `alloc' and `simple_alloc', e.g.: MP_GFP_KERNEL|MP_GFP_ZERO.
Here's an example allocation profile using alloc_pages:
alloc_pages,1,MP_GFP_KERNEL
alloc_pages,1,MP_GFP_KERNEL|MP_GFP_ZERO
alloc_pages,9,MP_GFP_KERNEL
alloc_pages,9,MP_GFP_KERNEL|MP_GFP_ZERO
o `op' == create_unused_client
When `op' == create_unused_client, we will create a new Ion
client by opening /dev/ion. No allocations or any other work
whatsoever is performed. No other fields are used for this
op. This is useful for profiling the client creation and
destruction code.
o `op' == free_all_unused_clients
When `op' == free_all_unused_clients, all clients previously
created with create_unused_client are free'd by close()'ing all
file descriptors returned by the previous calls to open().
o `op' == user_alloc
When `op' == user_alloc, profile the libc `malloc' or `mmap'
functions.
The following remaining fields are defined:
allocator
alloc_size
usage_size
usage_fn
- allocator :: the underlying memory allocation function to
profile. Currently supported values: "malloc", "mmap". Note
that when mmap is used we also pass the MAP_POPULATE flag, so
the result will also include the time taken to fault the pages
in.
- alloc_size :: the size of the buffer to profile. Can be any
valid size string (e.g. "4KB", "2MB", etc). Supported suffixes
are "KB" "MB" and "GB" (or no suffix for bytes).
- usage_size :: how many of the allocated bytes to run through
usage_fn. Can be any valid size string, similar to alloc_size.
- usage_fn :: the function to use to fiddle with the allocated
memory. Supported values:
- nop :: don't touch the memory
- memset-n :: memset the memory to `n'
`n' is passed to `strtol' with a `base' of 0 so it can
take any of the values supported there (see strtol(3)).
e.g. memset-0xa5 would result in memset(buf, 0xa5, size)
memset-0 would result in memset(buf, 0, size)
o `op' == iommu_map_range
o `op' == iommu_unmap_range
o `op' == iommu_map
o `op' == iommu_unmap
When `op' equals one of the four operations above, profile the
corresponding api in the kernel.
The following remaining fields are defined:
context_name
chunk_order
nchunks
iterations
prot
flags
- context_name :: the Iommu context to use
- chunk_order :: the order of the pages to be used for each
chunk
- nchunks :: how many chunks to allocate
- iterations :: number of interations to run
- prot :: protection flags to be used for the mapping. Similar
to the gfp_flags field for the `alloc_pages' op, the real
kernel Iommu prot flags are not exported to userspace so we
create some of our own definitions that later get mapped to
the kernel's Iommu prot flags
MP_IOMMU_WRITE
MP_IOMMU_READ
MP_IOMMU_CACHE
The note on the gfp_kernel field about flag composition also
applies here.
- flags :: Addition flags used to control the test
MP_IOMMU_ATTACH - Attach to context bank
MP_IOMMU_SECURE - Test secure iommu functionality
The note on the gfp_kernel field about flag composition also
applies here.
o `op' == iommu_attach
o `op' == iommu_detach
When `op' equals one of the two operations above, profile the
corresponding api in the kernel.
The following remaining fields are defined:
context_name
iterations
flags
- context_name :: the Iommu context to use
- iterations :: number of interations to run
- flags :: Addition flags used to control the test
MP_IOMMU_SECURE - Test secure iommu functionality
The note on the gfp_kernel field about flag composition also
applies here.
o `op' == ion_cache
When `op' == cache, an ION_IOC_CLEAN_CACHES/ION_IOC_INV_CACHES/
ION_IOC_CLEAN_INV_CACHES will be performed and profiled on an
allocated ion buffer. The buffer will be free'd after the
profiling is complete.
The following remaining fields are defined:
reps
heap_id
flags
alloc_size
alloc_size_label
cache_clean
cache_invalidate
- reps :: How many times to repeat this allocation
- heap_id :: Heap to use for allocation. Should correspond to a
heap_id from `enum ion_heap_ids'. E.g.:
ION_CP_MM_HEAP_ID
- flags :: Flags to be used for allocation. Can parse bitwise OR'd
ION_FLAG_* constants (e.g.:
ION_SECURE|ION_FLAG_CACHED). No spaces please.
- alloc_size :: The size of the buffer to be allocated. Can be
any valid size string (e.g. "4KB", "2MB", etc). Supported
suffixes are "KB" "MB" and "GB" (or no suffix for bytes).
- alloc_size_label :: A human- (and script-) readable string
describing the allocation size
- cache_clean:: Whether we want to profile cache clean operation
- cache_invalidate :: Whether we want to profile cache invalidate
operation
For profiling ION_ION_CLEAN_INV_CACHES both cache_clean and
cache_invalidate should be true.
Blank lines and lines beginning with '#' are skipped.
See alloc_profiles/cache_ops.txt for a full example.
o `op' == conc
When `op' == conc, concurrency KPI tests are performed.
The following remaining fields are defined:
type
adj
writers
debug
repeat
pages
delay
count
- type :: Type of test. 0:geometric, 1:arithmetic
- adj :: the adj level to which a task is moved after allocation
- writers :: number of parallel file writers
- debug :: verbosity. 0 or 1
- repeat :: repeat the test "repeat" number of times
- pages :: NA, 0
- delay :: NA, 0
- count :: NA, 0
o `op' == lat
When `op' == lat, latency KPI tests are performed.
The following remaining fields are defined:
type
adj
writers
debug
- type :: Type of test. 0:geometric, 1:arithmetic, 2:constant
- adj :: the adj level to which a task is moved after allocation
- writers :: number of parallel file writers
- debug :: verbosity. 0 or 1
- repeat :: repeat the test "repeat" number of times
- pages :: The number of pages to be alocated when type is "constant"
- delay :: The delay between each "pages" allocation when type is "constant"
- count :: The number of "pages" allocation when type is "contant"
o `op' == stability
When `op' == stability, stability tests are performed.
The following remaining fields are defined:
type
adj
writers
debug
- type :: Type of test. 0:geometric, 1:arithmetic
- adj :: the adj level to which a task is moved after allocation
- writers :: number of parallel file writers
- debug :: verbosity. 0 or 1
- repeat :: NA, 0
- pages :: NA, 0
- delay :: NA, 0
2024-09-09 08:52:07 +00:00
2024-09-09 08:57:42 +00:00
- count :: NA, 0