Welcome!

Blog Feed Post

Exploring Windows Kernel with Fibratus and Logsene

This is a guest post by Nedim Šabić, developer of Fibratus, a tool for exploration and tracing of the Windows kernel. 

Unlike Linux / UNIX environments which provide a plethora of open source and native tools to instrument the user / kernel space internals, the Windows operating systems are pretty limited when it comes to diversity of tools and interfaces to perform the aforementioned tasks. Prior to Windows 7, you could use some of not so legal techniques like SSDT hooking to intercept system calls issued from the user space and do your custom pre-processing, but they are far from efficient or stable. The kernel mode driver could be helpful if it wouldn’t require a digital signature granted by Microsoft. Actually, some tools like Sysmon or Process Monitor can be helpful, but they are closed-source and don’t leave much room for extensibility or integration with external systems such as message queues, databases, endpoints, etc.

Fortunately, there is still hope. Event Tracing for Windows (ETW) is the native logging infrastructure built into every major Windows subsystem. Thus, we have ETW providers that emit Ethernet frames, notifications from the USB subsystem, Active Directory logins, NTFS file system operations, just to name a few. The ETW overhead is pretty low, which makes it a perfect fit for production environments.

However, the ETW API might be the one of the ugliest and the most verbose APIs ever constructed (it was publicly recognized by Microsoft). The extraction of the properties from the event buffer is truly a nightmare. The documentation is decent (if we don’t take into account the event schemas), but it lacks some more practical examples. If you go and search the Web, you won’t find any exhaustive material about ETW when looking into it from the programmer’s perspective. If we add to this the fact that there are no implementations of ETW in dynamic programming languages like Python, which is frequently used for incident response work, reverse engineering and malware analysis, those were reasons enough for giving the inception to what now is officially called Fibratus (if you’re wondering about the name, cirrus fibratus is the cloud formation type).

Fibratus is a tool (inspired on Sysdig) for exploration and tracing of the Windows kernel which relies on the kernel logger provider to collect the operating system activity. It does the heavy lifting on abstracting the cumbersome details of the ETW into a consistent mechanism capable of capturing a wide spectrum of operations such as file system I/O, process/thread creation, network activity, context switch instrumentation, registry activity, and so on. On top on Fibratus you can execute filaments, which are micro modules written in Python programming language. They provide a non-intrusive model to extend Fibratus with your own arsenal of tools leveraging the goodness of the Python’s ecosystem. Because the way to interact with Fibratus is via CLI, it also uses the console as a primary sink to output the kernel’s activity to the end user. Besides that, there are a number of output adapters such as SMTP, AMQP, and the Elasticsearch adapter to send the kernel information through those respective transports. Let’s see how to setup Fibratus and configure it to aggregate the kernel event stream to Logsene – a log management service with Elasticsearch API.

Fibratus setup

Fibratus is hosted on PyPI. Run the following command to install it (keep in mind Fibratus requires the C compiler in order to build the Cython extension):

$ pip install fibratus

After the installation has completed successfully, you should be able to run Fibratus:

$ fibratus run

The previous command will render a bunch of output depending on your system load. You may be interested in a specific kernel events. No problem! Use the –filters option to specify the list of kernel event names:

$ fibratus run --filters CreateProcess CreateThread TerminateThread

On my machine it results in the following output:

0 17:52:31.730000 3 cmd.exe (1440) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00acbed000, pid=1440, tid=8204, ustack_base=0xc4419d0000)

1 17:52:31.830000 2 cmd.exe (1440) - TerminateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00acbed000, pid=1440, tid=8204, ustack_base=0xc4419d0000)

2 17:52:35.276000 3 <NA> (2936) - CreateProcess (comm=C:\WINDOWS\system32\cmd.exe /c dir /-C /W c:/Users/Nedo/AppData/Roaming/RabbitMQ/db/RABBIT~1, exe=C:\WINDOWS\system32\cmd.exe, name=cmd.exe, pid=7008, ppid=2936)

3 17:52:35.278000 3 cmd.exe (7008) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=7008, tid=6972, ustack_base=0x542ef00000)

4 17:52:35.288000 3 cmd.exe (7008) - TerminateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=7008, tid=6972, ustack_base=0x542ef00000)

5 17:52:36.112000 1 cmd.exe (2908) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=2908, tid=2700, ustack_base=0x313c300000)

6 17:52:36.129000 0 cmd.exe (2908) - TerminateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=2908, tid=2700, ustack_base=0x313c300000)

7 17:52:41.137000 3 <NA> (2936) - CreateProcess (comm=C:\WINDOWS\system32\cmd.exe /c handle.exe /accepteula -s -p 2936 2> nul, exe=C:\WINDOWS\system32\cmd.exe \cmd.exe, name=cmd.exe, pid=7056, ppid=2936)

8 17:52:41.139000 3 cmd.exe (7056) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a656b000, pid=7056, tid=5524, ustack_base=0x4239740000)

...

Sometimes, it’s interesting to spy on the specific process. To do that, use the –pid flag, providing the PID of the process. Lastly, the filaments are executed by appending the –filament option next to filament name. To enumerate the available filaments use fibratus list-filaments command.

Indexing to Logsene

Once you’ve become familiar with Fibratus you will want to store and index all the output, so you can properly analyze it.  To do that we’ll configure the Elasticsearch adapter and point it to Logsene. The configuration descriptor is located in $HOME.fibratus\fibratus.yml. Open the file and provide the values for the Elasticsearch host/s, the index and document names (the credentials are optional):

output:

- elasticsearch:

     hosts:

       - logsene-receiver.sematext.com:443

    index: <your-logsene-app-token>

    document: kernel

    bulk: True

    ssl: True

If you enable the secure transport, make sure the certifi package is installed for the certificate verification to be done properly. Now comes the funny part – the actual code of the filament responsible for sending the kernel event stream to Logsene. Let’s go through it.

"""

Performs the indexing of the kernel's event stream to

Elasticsearch on interval basis. When the scheduled

interval elapses, the list of documents aggregated

are indexed to Elasticsearch.

"""

from datetime import datetime

documents = []

def on_init():

  set_filter('CreateThread', 'CreateProcess', 'TerminateThread',  'TerminateProcess','CreateFile', 'DeleteFile', 'WriteFile', 'RenameFile', 'Recv', 'Send', 'Accept', 'Connect', 'Disconnect', 'LoadImage', 'UnloadImage', 'RegCreateKey', 'RegDeleteKey', 'RegSetValue')

  set_interval(1)

def on_next_kevent(kevent):

  doco = {'image': kevent.thread.name,

          'thread': {

              'exe': kevent.thread.exe,

              'comm': kevent.thread.comm,

              'pid': kevent.thread.pid,

              'tid': kevent.tid,

              'ppid': kevent.thread.ppid},

          'category': kevent.category,

          'name': kevent.name,

          'ts': '%s %s' % (datetime.now().strftime('%m/%d/%Y'),

                           kevent.timestamp.strftime('%H:%M:%S.%f')),

          'cpuid': kevent.cpuid,

          'params': kevent.params}

  documents.append(doco)

def on_interval():

  if len(documents) > 0:

      elasticsearch.emit(documents)

      documents.clear()

Firstly, on filament’s initialization, we set the list of filters for the kernel events we want to capture. Because the indexing operation takes place periodically, we need to call the set_interval function to establish the interval in seconds. The on_next_kevent function will aggregate the kernel event payload to the documents list, which will later be consumed and indexed to Logsene when on_interval function is fired. That’s pretty straightforward. Let’s run the filament:

$ fibratus run --filament elasticsearch_indexing

If you open your Logsene application, you should see the data stream coming, like in the figure above.

image06https://sematext.com/wp-content/uploads/2016/11/image06-300x130.png 300w, https://sematext.com/wp-content/uploads/2016/11/image06-768x334.png 768w, https://sematext.com/wp-content/uploads/2016/11/image06-1024x445.png 1024w" sizes="(max-width: 1917px) 100vw, 1917px" />

Exploring the data

Logsene comes with out of the box support for histograms and top list values. Those come very handy for finding out some basic insights about the kernel and the user space activity. For example, let’s visualize the top kernel events grouped by the event name as well as by category.

image00https://sematext.com/wp-content/uploads/2016/11/image00-300x142.png 300w" sizes="(max-width: 419px) 100vw, 419px" />

image03https://sematext.com/wp-content/uploads/2016/11/image03-1-300x188.png 300w" sizes="(max-width: 431px) 100vw, 431px" />

The above charts  give us a clear breakdown of a trace captured with Fibratus. The file system and network operations took the majority of the time in this trace. The CreateFile and the WriteFile kernel events map to NtCreateFile and NtWriteFile system calls respectively. That doesn’t mean there were 7928 file creations, though. When the user space process requests to access a file or an I/O device, it issues the NtCreateFile system call that, depending on the arguments passed, will open or create the underlying file or operate on the I/O device. To narrow down the scope of the search, we can filter by params.operation field.

image07https://sematext.com/wp-content/uploads/2016/11/image07-300x118.png 300w, https://sematext.com/wp-content/uploads/2016/11/image07-768x302.png 768w, https://sematext.com/wp-content/uploads/2016/11/image07-1024x403.png 1024w" sizes="(max-width: 1432px) 100vw, 1432px" />

You can see all the files created during the duration of the trace. In my case they were various threads (note the thread.tid field) of the chrome process which created a number of temporary files, and the single-threaded sysmon process related to directories  creations. We can also build a dashboard widget to observe the progression of file creations.

image02https://sematext.com/wp-content/uploads/2016/11/image02-1-300x129.png 300w, https://sematext.com/wp-content/uploads/2016/11/image02-1-768x330.png 768w" sizes="(max-width: 832px) 100vw, 832px" />

Tracing the antivirus software

I was curious to see what the AVG antivirus was doing on my computer. All you need to do in Logsene is to fire up the wildcard search with the image field containing the avg segment.

image08https://sematext.com/wp-content/uploads/2016/11/image08-300x158.png 300w, https://sematext.com/wp-content/uploads/2016/11/image08-768x405.png 768w, https://sematext.com/wp-content/uploads/2016/11/image08-1024x540.png 1024w" sizes="(max-width: 1434px) 100vw, 1434px" />

You might expect to see a lot of file system related operations. Indeed, a large number of kernel events deal with accessing the file objects in order to find a malicious patterns. While keeping Fibratus running, the next thing I did was to launch a scan from the AVG interface. The first thing that caught my attention was a sequence of TCP packets generated by the AVG User Interface (avgui.exe) as well as AVG WatchDog Service (avgwdsvca.exe) process.

image04https://sematext.com/wp-content/uploads/2016/11/image04-300x40.png 300w, https://sematext.com/wp-content/uploads/2016/11/image04-768x102.png 768w, https://sematext.com/wp-content/uploads/2016/11/image04-1024x136.png 1024w" sizes="(max-width: 1379px) 100vw, 1379px" />

Those packets were encapsulating HTTP requests to the  95.100.126.85 IP address. If you perform an IPWHOIS lookup it will reveal  Akamai Technologies is behind that address. Hmm, those might be the servers from where the antivirus is pulling for updates.

Tracing threads and registry I/O

Next, I wanted to see the thread and the registry I/O activity. A huge number of threads were created, mostly the ones associated with the user interface processes.

image05https://sematext.com/wp-content/uploads/2016/11/image05-300x97.png 300w, https://sematext.com/wp-content/uploads/2016/11/image05-768x249.png 768w, https://sematext.com/wp-content/uploads/2016/11/image05-1024x332.png 1024w" sizes="(max-width: 1381px) 100vw, 1381px" />

I didn’t find much registry activity, except the AVG Watchdog Service setting the SOFTWARE\AVG\Zen\ConfigData\ZenMainWindowOpen binary value and the avgui process trying to create the Software\Microsoft\Windows\CurrentVersion\Internet Settings\Connections registry key.

image01https://sematext.com/wp-content/uploads/2016/11/image01-1-300x113.png 300w, https://sematext.com/wp-content/uploads/2016/11/image01-1-768x288.png 768w" sizes="(max-width: 925px) 100vw, 925px" />

We just scratched the surface of what’s possible when digging deeper into the sequence of kernel events emitted from Fibratus and combining the powerful analyzing capabilities of the Logsene platform.

SIGN UP – FREE TRIAL

Read the original blog entry...

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.

Latest Stories
You know you need the cloud, but you’re hesitant to simply dump everything at Amazon since you know that not all workloads are suitable for cloud. You know that you want the kind of ease of use and scalability that you get with public cloud, but your applications are architected in a way that makes the public cloud a non-starter. You’re looking at private cloud solutions based on hyperconverged infrastructure, but you’re concerned with the limits inherent in those technologies.
SYS-CON Events announced today that MobiDev, a client-oriented software development company, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MobiDev is a software company that develops and delivers turn-key mobile apps, websites, web services, and complex software systems for startups and enterprises. Since 2009 it has grown from a small group of passionate engineers and business...
Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more business becomes digital the more stakeholders are interested in this data including how it relates to business. Some of these people have never used a monitoring tool before. They have a question on their mind like “How is my application doing” but no id...
What's the role of an IT self-service portal when you get to continuous delivery and Infrastructure as Code? This general session showed how to create the continuous delivery culture and eight accelerators for leading the change. Don Demcsak is a DevOps and Cloud Native Modernization Principal for Dell EMC based out of New Jersey. He is a former, long time, Microsoft Most Valuable Professional, specializing in building and architecting Application Delivery Pipelines for hybrid legacy, and cloud ...
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend 21st Cloud Expo October 31 - November 2, 2017, at the Santa Clara Convention Center, CA, and June 12-14, 2018, at the Javits Center in New York City, NY, and learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
21st International Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Me...
Join us at Cloud Expo June 6-8 to find out how to securely connect your cloud app to any cloud or on-premises data source – without complex firewall changes. More users are demanding access to on-premises data from their cloud applications. It’s no longer a “nice-to-have” but an important differentiator that drives competitive advantages. It’s the new “must have” in the hybrid era. Users want capabilities that give them a unified view of the data to get closer to customers and grow business. The...
"We focus on composable infrastructure. Composable infrastructure has been named by companies like Gartner as the evolution of the IT infrastructure where everything is now driven by software," explained Bruno Andrade, CEO and Founder of HTBase, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
SYS-CON Events announced today that Cloud Academy named "Bronze Sponsor" of 21st International Cloud Expo which will take place October 31 - November 2, 2017 at the Santa Clara Convention Center in Santa Clara, CA. Cloud Academy is the industry’s most innovative, vendor-neutral cloud technology training platform. Cloud Academy provides continuous learning solutions for individuals and enterprise teams for Amazon Web Services, Microsoft Azure, Google Cloud Platform, and the most popular cloud com...
"We are a monitoring company. We work with Salesforce, BBC, and quite a few other big logos. We basically provide monitoring for them, structure for their cloud services and we fit into the DevOps world" explained David Gildeh, Co-founder and CEO of Outlyer, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
Automation is enabling enterprises to design, deploy, and manage more complex, hybrid cloud environments. Yet the people who manage these environments must be trained in and understanding these environments better than ever before. A new era of analytics and cognitive computing is adding intelligence, but also more complexity, to these cloud environments. How smart is your cloud? How smart should it be? In this power panel at 20th Cloud Expo, moderated by Conference Chair Roger Strukhoff, paneli...
DevOps at Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to w...
The current age of digital transformation means that IT organizations must adapt their toolset to cover all digital experiences, beyond just the end users’. Today’s businesses can no longer focus solely on the digital interactions they manage with employees or customers; they must now contend with non-traditional factors. Whether it's the power of brand to make or break a company, the need to monitor across all locations 24/7, or the ability to proactively resolve issues, companies must adapt to...
Cloud Expo, Inc. has announced today that Andi Mann and Aruna Ravichandran have been named Co-Chairs of @DevOpsSummit at Cloud Expo Silicon Valley which will take place Oct. 31-Nov. 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. "DevOps is at the intersection of technology and business-optimizing tools, organizations and processes to bring measurable improvements in productivity and profitability," said Aruna Ravichandran, vice president, DevOps product and solutions marketing...
The current age of digital transformation means that IT organizations must adapt their toolset to cover all digital experiences, beyond just the end users’. Today’s businesses can no longer focus solely on the digital interactions they manage with employees or customers; they must now contend with non-traditional factors. Whether it's the power of brand to make or break a company, the need to monitor across all locations 24/7, or the ability to proactively resolve issues, companies must adapt to...