Friday, November 10, 2017

Deep Learning Basic Concepts


Pruning: Pruning is a technique in machine learning that reduces the size of decision trees by removing sections of the tree that provide little power to classify instances. Pruning reduces the complexity of the final classifier and hence improves predictive accuracy by reducing over-fitting.
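A small, hedged sketch of this (it assumes scikit-learn, which these notes do not mention, and the ccp_alpha value is arbitrary): cost-complexity pruning shrinks a decision tree and usually helps it generalize.

# Sketch only: assumes scikit-learn. ccp_alpha > 0 prunes subtrees that add little
# classification power, so the pruned tree is usually much smaller.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X_train, y_train)

print(unpruned.tree_.node_count, pruned.tree_.node_count)            # node counts: the pruned tree is smaller
print(unpruned.score(X_test, y_test), pruned.score(X_test, y_test))  # accuracy on unseen data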


Over-Fitting: In over-fitting, performance on the training examples keeps improving while performance on unseen data becomes worse.

    Usually a learning algorithm is trained using some set of "training data": example situations for which the desired output is known. The goal is that the algorithm will also perform well in predicting the output when fed "validation data" that was not encountered during its training.

    Making the function more complex improves performance on the training data set, but it increases the error rate (reduces prediction accuracy) on unseen data. This situation is called over-fitting.
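A hedged illustration (it assumes scikit-learn and one of its built-in data sets, neither of which is part of these notes): a 1-nearest-neighbour model memorizes the training data perfectly, and the gap between its training score and its test score is the over-fitting.

# Sketch only: compare a very flexible model (k=1) with a smoother one (k=15).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 15):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    # k=1 scores 1.0 on the training data but typically worse on the unseen test data.
    print(k, model.score(X_train, y_train), model.score(X_test, y_test))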


Supervised Learning: A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. An optimal scenario will allow the algorithm to correctly determine the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a reasonable way.


https://dataaspirant.com/2014/09/19/supervised-and-unsupervised-learning/


Unsupervised Learning: It is a type of machine learning algorithm used to draw inferences from data sets consisting of input data without labeled responses. The most common unsupervised learning method is cluster analysis, which is used for exploratory data analysis to find hidden patterns or groupings in data.
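A minimal sketch of cluster analysis (assuming scikit-learn's KMeans; the data points are made up):

# Sketch only: k-means groups unlabeled points into clusters.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 1.1], [0.9, 1.0], [8.0, 8.2], [8.1, 7.9]])   # input data, no labels
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)            # e.g. [0 0 1 1] - the hidden grouping found in the data
print(kmeans.cluster_centers_)   # centre of each discovered cluster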


Logistic Regression: Logistic regression is a member of the supervised classification algorithm family. The building-block concepts of logistic regression are helpful in deep learning when building neural networks.
 
     A logistic regression classifier is essentially a linear classifier that uses a calculated score to predict the target class.

    The logistic regression model takes the feature values and calculates class probabilities using the sigmoid or softmax function.

    The sigmoid function is used for binary classification problems and the softmax function for multi-class classification problems.
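A small sketch of the two functions (using NumPy, which is my assumption; the score values are arbitrary):

import numpy as np

def sigmoid(z):
    # squashes one score into a probability between 0 and 1 (binary classification)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # turns a vector of scores into probabilities over several classes
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

print(sigmoid(0.8))                        # probability of the positive class
print(softmax(np.array([2.0, 1.0, 0.1])))  # class probabilities that sum to 1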

https://dataaspirant.com/2014/10/02/linear-regression/

https://dataaspirant.com/2017/03/02/how-logistic-regression-model-works/

http://dataaspirant.com/2017/04/15/implement-logistic-regression-model-python-binary-classification/

https://dataaspirant.com/2017/03/07/difference-between-softmax-function-and-sigmoid-function/



Learning Python

Some simple features in Python that do not exist in C:

Getting an integer result with division:

x = 50 / 6   =>  x = 8.333...  (float value)

If we want integer division, use the "//" operator.

x = 50//6 = 8
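Checked in the interpreter:

print(50 / 6)    # 8.333333333333334 - true division always returns a float in Python 3
print(50 // 6)   # 8 - floor division returns the integer result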


int(A) => returns an error when A cannot be converted to an integer (for example, int("A") raises a ValueError, while int("5") returns 5).


Name = "Michael Jackson"

Name[0]: M
Name[14] : n

We can also use negative indexing for strings.

Name[-15]:M
Name[-1]: n


String Slicing:
Name[0:4] = Mich
Name[8:12]= Jack

String Stride:



Name[::2] = "McalJcsn"  => the 2 means take every 2nd character.


String Stride and Slice:


Name[0:5:2] = "Mca"

0:5 means [0, 5): it includes 0 and excludes 5, so only indexes 0 to 4 are considered.

[0:5:2] means slice from index 0 to 4 and take every 2nd character within that slice.
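Putting the indexing, slicing and striding examples above together as one runnable snippet:

Name = "Michael Jackson"
print(Name[0], Name[14])       # M n   - positive indexing
print(Name[-15], Name[-1])     # M n   - negative indexing
print(Name[0:4], Name[8:12])   # Mich Jack - slicing
print(Name[::2])               # McalJcsn  - every 2nd character
print(Name[0:5:2])             # Mca       - slice 0..4, then every 2nd character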


Tuples :
Tuples are written as comma separated elements within parentheses.

Tuples are immutable.   



Ratings[2] = 4 gives an error: tuples are immutable, so we cannot change values inside a tuple.

Due to the immutability of tuples, if we want to manipulate a tuple we need to create a new tuple (see the sketch below).
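A small sketch (the Ratings tuple here is hypothetical, chosen only to show the error and the usual work-around):

Ratings = (9, 6, 5, 10, 8)                       # hypothetical tuple
# Ratings[2] = 4                                 # TypeError: tuples do not support item assignment
NewRatings = Ratings[:2] + (4,) + Ratings[3:]    # build a new tuple instead
print(NewRatings)                                # (9, 6, 4, 10, 8)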


Lists:

Lists are mutable. Lists are represented in Square Brackets.

The extend method adds the items as separate elements of the final list: "pop" and 10 become separate elements in the list.

The append method adds its argument as a single element: ["pop", 10] becomes one nested list inside the list (see the sketch below).
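A sketch of the difference (the starting list is hypothetical; the notes only mention the "pop" and 10 items):

A = ["rock", 1.5]
A.extend(["pop", 10])    # "pop" and 10 are added as separate elements
print(A)                 # ['rock', 1.5, 'pop', 10]

B = ["rock", 1.5]
B.append(["pop", 10])    # the whole argument is added as one nested element
print(B)                 # ['rock', 1.5, ['pop', 10]]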

List cloning:
A = [ "Cloning" , test , 10 , 2.1]
B = A   => Both A and B referring to the same list . Both are having same reference.

B = A [ : ]

Here both A and B have the same list elements, because B is formed from a new copy of the elements of A; modifying one does not affect the other (as shown below).
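To see the difference between the reference copy and the clone, mutate A after each assignment (a minimal sketch):

A = ["Cloning", "test", 10, 2.1]
B = A        # alias: B refers to the same list object as A
C = A[:]     # clone: C is a new list with a copy of A's elements
A[0] = "Changed"
print(B[0])  # Changed - B follows A because they share one list
print(C[0])  # Cloning - C is unaffected; it is a separate copy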


Sets:
Sets are created using Curly Brackets. 
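A minimal example (the items are hypothetical):

Genres = {"rock", "pop", "jazz", "rock"}   # duplicate "rock" is dropped automatically
print(Genres)                              # {'pop', 'rock', 'jazz'} - order is not guaranteed
Genres.add("soul")                         # sets are mutable
print("rock" in Genres)                    # True - fast membership test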


Dictionaries:

dir(NameofObject): returns the list of attributes and methods of that object.
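The notes stop at dir(), so here is a minimal dictionary example (the keys and values are hypothetical) with dir() applied to it:

Album = {"title": "Thriller", "year": 1982}   # key: value pairs inside curly brackets
print(Album["year"])                          # 1982 - values are looked up by key, not by position
Album["length"] = "42:19"                     # adding a new key is allowed: dictionaries are mutable
print(dir(Album))                             # lists the dict object's attributes/methods, e.g. 'keys', 'values'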

Sunday, February 19, 2017

Some of Kernel Debugging techniques - Compile time flags

To enable all debug logs in a specific file at compile time:

CFLAGS_amdgpu_amdkfd_gpuvm.o := -DDEBUG

Adding this flag in the Makefile makes sure that all the debug logs from amdgpu_amdkfd_gpuvm.c are available in dmesg.


Tracing function calls in a specific file using compile-time options:

CFLAGS_amdgpu_amdkfd_gpuvm.o := -DDEBUG  -finstrument-functions

Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit,

the following profiling functions will be called with the address of the current function and its call site. […]

          void __cyg_profile_func_enter (void *this_fn,
                                         void *call_site);
          void __cyg_profile_func_exit  (void *this_fn,
                                         void *call_site);
[…]

Ftrace will not provide information about calls to static functions, but -finstrument-functions
allows printing all function calls executed in that file.

We need to define these two functions ourselves, since -finstrument-functions inserts a call to
__cyg_profile_func_enter at function entry and __cyg_profile_func_exit at function exit.

void __cyg_profile_func_enter (void *this_fn, void *call_site) {
    printk( "entering %pS\n", this_fn );
}

void __cyg_profile_func_exit (void *this_fn, void *call_site) {
    printk( "leaving %pS\n", this_fn );
}

%pS will print function name from function pointer.

Adding the information below just as a refresher:

ltrace is used to trace a process's library calls.

strace is used to trace system calls.

pstack - print a stack trace of running processes.

Suppose a program hangs at some point and we would like to know its call stack:

$ sudo pstack `pidof glxgears`

Tuesday, September 27, 2016

LINUX SUSPEND / RESUME DEBUGGING TECHNIQUES

                       

initcall_debug : Adding the initcall_debug boot option to the kernel cmdline will trace initcalls and the driver pm callbacks during boot, suspend, and resume.

no_console_suspend : Adding the no_console_suspend boot option to the kernel cmdline disables suspending of consoles during suspend/hibernate.

ignore_loglevel :  Adding the ignore_loglevel boot option to the kernel cmdline prints all kernel messages to the console no matter what the current loglevel is, which is useful for debugging.

Serial console : To enable serial console, add console=ttyS0,115200 and no_console_suspend to the kernel cmdline.

Refer http://ayyappa-ch.blogspot.in/2015/07/serial-console-logging.html

Dynamic debug : Dynamic debug is designed to allow you to dynamically enable/disable kernel code to obtain additional kernel information. Currently, if CONFIG_DYNAMIC_DEBUG is set, then all pr_debug()/dev_dbg() calls can be dynamically enabled per-callsite.

Refer https://lwn.net/Articles/434856/

pm_async, pm_test:
Refer https://01.org/blogs/rzhang/2015/best-practice-debug-linux-suspend/hibernate-issues

Enable PM_DEBUG and PM_TRACE:
use a script like this to suspend:

#!/bin/sh
sync
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state    # or use the suspend option from the GUI

if it doesn't come back up (which is usually the problem), reboot by holding the power button down, and look at the dmesg output for things like

Magic number: 4:156:725
hash matches drivers/base/power/resume.c:28
hash matches device 0000:01:00.0

which means that the last trace event was just before trying to resume  device 0000:01:00.0. Then   figure out what driver is controlling that device (lspci and /sys/devices/pci* is your friend), and see if you can fix it, disable it, or trace into its resume function.

If no device matches the hash (or any matches appear to be false positives), the culprit may be a device from a loadable kernel module that is not loaded until after the hash is checked. You can check the hash against the current devices again after more modules are loaded using sysfs:

cat /sys/power/pm_trace_dev_match
echo 1 > /sys/power/pm_trace

In one of my issues, pm_trace_dev_match showed acpi, which means the issue exists in the BIOS.

 Refer https://www.kernel.org/doc/Documentation/power/s2ram.txt


analyze_suspend : The analyze_suspend tool provides the capability for system developers to visualize the activity between suspend and resume, allowing them to identify inefficiencies and bottlenecks. For example, you can use the following command to start:

./analyze_suspend.py -rtcwake 30 -f -m mem
And 30 seconds later the system resumes automatically and generates 3 files in the ./suspend-yymmddyy-hhmmss directory:

mem_dmesg.txt  mem_ftrace.txt  mem.html

You can first open the mem.html file with a browser, and then dig into mem_ftrace.txt for data details. You can get the analyze_suspend tool via git:

git clone https://github.com/01org/suspendresume.git

For more details, go to the homepage: https://01.org/suspendresume.

Test result: mem_dmesg.txt mem_ftrace.txt mem.html


Log Files: https://drive.google.com/drive/folders/0B_UViXaGblZQcHA1Mk1UUzB0V1E

Suspend/Resume Flow:





References:
1) https://01.org/blogs/rzhang/2015/best-practice-debug-linux-suspend/hibernate-issues
2) https://www.kernel.org/doc/Documentation/power/s2ram.txt
3) https://lwn.net/Articles/434856/
4) https://github.com/01org/suspendresume
5) https://wiki.ubuntu.com/DebuggingKernelSuspend

Wednesday, August 3, 2016

Google profiler for Performance and Memory Analysis



Google profiler tool installation on Ubuntu:
sudo apt-get install google-perftools

Analyse memory consumption:
LD_PRELOAD=/usr/lib/libtcmalloc.so.0.0.0 HEAPPROFILE=gpt-heapprofile.log ./your-program

To also analyse mmap allocations, set the HEAP_PROFILE_MMAP environment variable to true.

Performance analysis:
LD_PRELOAD=/usr/lib/libprofiler.so.0.4.5 CPUPROFILE=/home/amd/gst-log gst-launch-1.0 -f filesrc location= ./1080p_H264.mp4 ! qtdemux ! h264parse ! vaapidecode ! filesink location= test.yuv


Convert Data to pdf format:
google-pprof --pdf  /usr/bin/python /home/amd/gst-log >  profile_output.pdf

Text output can be obtained by typing:
google-pprof --text /usr/bin/python /home/amd/gst-log > profiling_output.txt

The file "/home/amd/gst-log" can also be analyzed with some specific graphical interfaces like "kcachegrind".

To prepare the data for kcachegrind, type:
   google-pprof --callgrind /usr/bin/python /home/amd/gst-log > profiling_kcachegrind.txt
 
To visualize the information, use kcachegrind:
   kcachegrind profiling_kcachegrind.txt &

Example view of Performance Analysis:





References:
http://goog-perftools.sourceforge.net/doc/cpu_profiler.html

http://alexott.net/en/writings/prog-checking/GooglePT.html

http://kratos-wiki.cimne.upc.edu/index.php/How_to_Profile_an_application_(using_google-perftools)

http://stackoverflow.com/questions/10874308/how-to-use-google-perf-tools




Tuesday, August 2, 2016

Debugging using gdb Tracepoints


Trace command: 
The trace command is very similar to the break command. Its argument can be a source line, a function name, or an address in the target program.

The trace command defines a tracepoint, which is a point in the target program where the debugger will briefly stop, collect some data, and then allow the program
to continue.

Setting a tracepoint or changing its commands doesn't take effect until the next tstart command.

(gdb) trace foo.c:121    // a source file and line number

(gdb) trace +2           // 2 lines forward

(gdb) trace my_function  // first source line of function

(gdb) trace *my_function // EXACT start address of function

(gdb) trace *0x2117c4    // an address

(gdb) delete trace 1 2 3 // remove three tracepoints

(gdb) delete trace       // remove all tracepoints

(gdb) info trace        // trace points info


Starting and Stopping Trace Experiment:

tstart : It starts the trace experiment, and begins collecting data.

tstop :  It ends the trace experiment, and stops collecting data.

tstatus : This command displays the status of the current trace data collection.


Enable and Disable Tracepoints:

disable tracepoint [num] : Disable tracepoint num, or all tracepoints if no argument num is given.

enable tracepoint [num] : Enable tracepoint num, or all tracepoints.


Tracepoint Passcounts:

passcount [n [num]] : Set the passcount of a tracepoint. The passcount is a way to automatically stop a trace experiment. If a tracepoint's passcount is n,  then the trace experiment will be automatically stopped on the n'th time that tracepoint is hit. If the tracepoint number num is not specified, the passcount command sets the passcount of the most recently defined tracepoint. If no passcount is given, the trace experiment will run until stopped explicitly by the user.

Examples:
(gdb) passcount 5 2 // Stop on the 5th execution of  tracepoint 2

(gdb) passcount 12  // Stop on the 12th execution of the most recently defined tracepoint.
(gdb) trace foo
(gdb) pass 3
(gdb) trace bar
(gdb) pass 2
(gdb) trace baz
(gdb) pass 1        // Stop tracing when foo has been
                           // executed 3 times OR when bar has
                           // been executed 2 times
                           // OR when baz has been executed 1 time.


Tracepoint Action Lists:

actions [num] : This command will prompt for a list of actions to be taken when the tracepoint is hit. If the tracepoint number num is not specified, this command sets the actions for the tracepoint that was most recently defined. You specify the actions themselves on the following lines, one action at a time,
and terminate the actions list with a line containing just end.

(gdb) collect data // collect some data

(gdb) while-stepping 5 // single-step 5 times, collect data

(gdb) end              // signals the end of actions.


collect expr1, expr2, ...
Collect values of the given expressions when the tracepoint is hit. This command accepts a comma-separated list of any valid expressions.

In addition to global, static, or local variables, the following special arguments are supported:

$regs
collect all registers
$args
collect all function arguments
$locals
collect all local variables.

Example:

(gdb) trace gdb_c_test
(gdb) actions
Enter actions for tracepoint #1, one per line.
> collect $regs,$locals,$args
> while-stepping 11
  > collect $regs
  > end
> end
(gdb) tstart
[time passes ...]
(gdb) tstop


Using the collected data:

tfind start
Find the first snapshot in the buffer. This is a synonym for tfind 0 (since 0 is the number of the first snapshot).
tfind none
Stop debugging trace snapshots, resume live debugging.
tfind end
Same as `tfind none'.
tfind
No argument means find the next trace snapshot.

The tracepoint facility is currently available only for remote targets.





Reference Link:

ftp://ftp.gnu.org/old-gnu/Manuals/gdb/html_chapter/gdb_10.html

http://stackoverflow.com/questions/3691394/gdb-meaning-of-tstart-error-you-cant-do-that-when-your-target-is-exec

http://stackoverflow.com/questions/38716790/gdb-meaning-of-tstop-error-you-cant-do-that-when-your-target-is-multi-thread

Linux Kernel Tracepoints , TRACE_EVENT() macro and Perf Tool



Why Tracepoints needed:

In some applications it is not feasible for the debugger to interrupt the program's execution long enough for the developer to learn anything helpful about its behavior. If the program's correctness depends on its real-time behavior, delays introduced by a debugger might cause the program to change its behavior drastically, or perhaps fail, even when the code itself is correct. It is useful to be able to observe the program's behavior without interrupting it.

What are Tracepoints:

A tracepoint placed in code provides a hook to call a function that you can provide at runtime.

A tracepoint can be "on" or "off".

When a tracepoint is "off" it has no effect, except for adding a tiny time penalty and space penalty .

When a tracepoint is "on", the function you provide is called each time the tracepoint is executed, in the execution context of the caller.

When the function provided ends its execution, it returns to the caller.

You can put tracepoints at important locations in the code.

Unlike Ftrace's function tracer, a tracepoint can record local variables of the function.

A tracepoint is a function call placed in the kernel code that, when enabled, calls a callback function, passing the parameters of the tracepoint to that function as if the callback function had been called directly with those parameters.

TRACE_EVENT() macro was specifically made to allow a developer to add tracepoints to their subsystem and have Ftrace automatically be able to trace them.

The anatomy of the TRACE_EVENT() macro:
It must create a tracepoint that can be placed in the kernel code.

It must create a callback function that can be hooked to this tracepoint.

The callback function must be able to record the data passed to it into the tracer ring buffer in the fastest way possible.

It must create a function that can parse the data recorded to the ring buffer and translate it to a human readable format that the tracer can display to a user.


Playing with trace events:

cd /sys/kernel/debug/tracing/events

root@amd-PADEMELON:/sys/kernel/debug/tracing/events/drm# ls
drm_vblank_event  drm_vblank_event_delivered  drm_vblank_event_queued  enable  filter

echo 1 > ./drm/enable

The enable files are used to enable a tracepoint. The enable file in the events directory can enable or disable all events in the system, the enable file in one of the system's directories can enable or disable all events within the system, and the enable file within the specific event  directory can enable or disable that event.


Tracepoint logs can be seen with Ftrace logs :

 3) + 20.320 us   |  dm_crtc_high_irq [amdgpu]();
 0)               |  /* drm_vblank_event_queued: pid=2430, crtc=0, seq=297608 */
 0)               |  send_vblank_event [drm]() {
 0)               |  /* drm_vblank_event_delivered: pid=2430, crtc=0, seq=297608 */
 0)   3.858 us    |  }
 3) + 24.783 us   |  dm_crtc_high_irq [amdgpu]();
 3)               |  dm_pflip_high_irq [amdgpu]() {
 3)               |    drm_send_vblank_event [drm]() {
 3)               |      send_vblank_event [drm]() {
 3)               |        /* drm_vblank_event_delivered: pid=0, crtc=0, seq=297609 */
 3) + 33.556 us   |      }
 3) + 35.227 us   |    }


 We can set the required events using set_event. This is the same as enabling a specific event using its enable file.

 [tracing] # echo drm_vblank_event drm_vblank_event_delivered drm_vblank_event_queued > set_event


PERF TOOL:
One of the key secrets for quick use of tracepoints is the perf tool. The CONFIG_EVENT_PROFILE configuration option should be set.

perf will be available at ./kernel/tools/perf

$ perf list -> List of events available in the system

$ perf stat -a -e kmem:kmalloc sleep 10  -> How many kmalloc() calls are happening on a system

The -a option gives whole-system results

$ perf stat -e kmem:kmalloc make  -> Monitoring allocations during the building of the perf tool




https://www.kernel.org/doc/Documentation/trace/tracepoints.txt - Kernel TracePoint
http://lwn.net/Articles/379903/
http://lwn.net/Articles/381064/
http://lwn.net/Articles/383362/
ftp://ftp.gnu.org/old-gnu/Manuals/gdb/html_chapter/gdb_10.html - GDB Tracepoints usage
https://lwn.net/Articles/346470/  - PERF TOOL