Exploring Windows Kernel with Fibratus and Logsene

This is a guest post by Nedim Šabić, developer of Fibratus, a tool for exploration and tracing of the Windows kernel. 

Unlike Linux and UNIX environments, which provide a plethora of open source and native tools for instrumenting user and kernel space internals, Windows operating systems are fairly limited in the diversity of tools and interfaces available for these tasks. Prior to Windows 7, you could use some not-so-legal techniques such as SSDT hooking to intercept system calls issued from user space and do your own custom pre-processing, but they are far from efficient or stable. A kernel-mode driver could help if it didn't require a digital signature granted by Microsoft. Tools like Sysmon or Process Monitor can be useful, but they are closed source and leave little room for extensibility or integration with external systems such as message queues, databases, endpoints, etc.

Fortunately, there is still hope. Event Tracing for Windows (ETW) is the native logging infrastructure built into every major Windows subsystem. There are ETW providers that emit Ethernet frames, notifications from the USB subsystem, Active Directory logins, and NTFS file system operations, to name just a few. The ETW overhead is pretty low, which makes it a good fit for production environments.

However, the ETW API might be one of the ugliest and most verbose APIs ever constructed (Microsoft has publicly acknowledged this). Extracting properties from the event buffer is truly a nightmare. The documentation is decent (if we don't take the event schemas into account), but it lacks practical examples. If you search the Web, you won't find any exhaustive material about ETW from a programmer's perspective. Add to this the fact that there are no implementations of ETW in dynamic programming languages like Python, which is frequently used for incident response, reverse engineering, and malware analysis, and there were reasons enough to start what is now officially called Fibratus (if you're wondering about the name, cirrus fibratus is a cloud formation type).

Fibratus is a tool (inspired by Sysdig) for exploration and tracing of the Windows kernel, which relies on the kernel logger provider to collect operating system activity. It does the heavy lifting of abstracting the cumbersome details of ETW into a consistent mechanism capable of capturing a wide spectrum of operations such as file system I/O, process/thread creation, network activity, context switch instrumentation, registry activity, and so on. On top of Fibratus you can execute filaments, which are micro modules written in the Python programming language. They provide a non-intrusive model for extending Fibratus with your own arsenal of tools, leveraging the goodness of the Python ecosystem. Because you interact with Fibratus through the CLI, it uses the console as the primary sink for presenting kernel activity to the end user. Besides that, there are a number of output adapters, such as SMTP, AMQP, and Elasticsearch, that send the kernel information over those respective transports. Let's see how to set up Fibratus and configure it to send the kernel event stream to Logsene, a log management service with an Elasticsearch API.

Fibratus setup

Fibratus is hosted on PyPI. Run the following command to install it (keep in mind that Fibratus requires a C compiler to build the Cython extension):

$ pip install fibratus

After the installation has completed successfully, you should be able to run Fibratus:

$ fibratus run

The previous command will render a bunch of output, depending on your system load. You may be interested only in specific kernel events. No problem! Use the --filters option to specify the list of kernel event names:

$ fibratus run --filters CreateProcess CreateThread TerminateThread

On my machine it results in the following output:

0 17:52:31.730000 3 cmd.exe (1440) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00acbed000, pid=1440, tid=8204, ustack_base=0xc4419d0000)

1 17:52:31.830000 2 cmd.exe (1440) - TerminateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00acbed000, pid=1440, tid=8204, ustack_base=0xc4419d0000)

2 17:52:35.276000 3 <NA> (2936) - CreateProcess (comm=C:\WINDOWS\system32\cmd.exe /c dir /-C /W c:/Users/Nedo/AppData/Roaming/RabbitMQ/db/RABBIT~1, exe=C:\WINDOWS\system32\cmd.exe, name=cmd.exe, pid=7008, ppid=2936)

3 17:52:35.278000 3 cmd.exe (7008) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=7008, tid=6972, ustack_base=0x542ef00000)

4 17:52:35.288000 3 cmd.exe (7008) - TerminateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=7008, tid=6972, ustack_base=0x542ef00000)

5 17:52:36.112000 1 cmd.exe (2908) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=2908, tid=2700, ustack_base=0x313c300000)

6 17:52:36.129000 0 cmd.exe (2908) - TerminateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a77e1000, pid=2908, tid=2700, ustack_base=0x313c300000)

7 17:52:41.137000 3 <NA> (2936) - CreateProcess (comm=C:\WINDOWS\system32\cmd.exe /c handle.exe /accepteula -s -p 2936 2> nul, exe=C:\WINDOWS\system32\cmd.exe \cmd.exe, name=cmd.exe, pid=7056, ppid=2936)

8 17:52:41.139000 3 cmd.exe (7056) - CreateThread (base_priority=8, io_priority=2, kstack_base=0xffff9e00a656b000, pid=7056, tid=5524, ustack_base=0x4239740000)

...
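If you want to post-process this console output yourself, the lines are regular enough to parse. The following is a minimal sketch, assuming the layout inferred from the sample above (sequence id, timestamp, CPU, process name, PID, event name, parameters); it is not an official grammar, and `parse_line` is a hypothetical helper, not part of Fibratus.

```python
import re

# Layout inferred from the sample output above (an assumption, not an
# official grammar): id, timestamp, CPU, process name, (pid), event, (params).
LINE_RE = re.compile(
    r'^(?P<seq>\d+)\s+(?P<ts>\S+)\s+(?P<cpu>\d+)\s+'
    r'(?P<proc>.+?)\s+\((?P<pid>\d+)\)\s+-\s+'
    r'(?P<name>\w+)\s+\((?P<params>.*)\)$'
)


def parse_line(line):
    """Parse one console line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    for key in ('seq', 'cpu', 'pid'):
        event[key] = int(event[key])
    # Naive split: breaks if a parameter value itself contains ", ".
    params = {}
    for pair in event['params'].split(', '):
        key, _, value = pair.partition('=')
        params[key] = value
    event['params'] = params
    return event


sample = ('0 17:52:31.730000 3 cmd.exe (1440) - CreateThread '
          '(base_priority=8, io_priority=2, kstack_base=0xffff9e00acbed000, '
          'pid=1440, tid=8204, ustack_base=0xc4419d0000)')
event = parse_line(sample)
print(event['name'], event['pid'], event['params']['tid'])
```

Parameter values are kept as strings, since some of them (paths, hexadecimal addresses) have no obvious numeric meaning.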

Sometimes it's interesting to spy on a specific process. To do that, use the --pid flag, providing the PID of the process. Lastly, filaments are executed by passing the --filament option followed by the filament name. To enumerate the available filaments, use the fibratus list-filaments command.

Indexing to Logsene

Once you've become familiar with Fibratus, you will want to store and index all the output so you can properly analyze it. To do that, we'll configure the Elasticsearch adapter and point it to Logsene. The configuration descriptor is located in $HOME\.fibratus\fibratus.yml. Open the file and provide the values for the Elasticsearch hosts, the index, and the document name (the credentials are optional):

output:
  - elasticsearch:
      hosts:
        - logsene-receiver.sematext.com:443
      index: <your-logsene-app-token>
      document: kernel
      bulk: True
      ssl: True

If you enable the secure transport, make sure the certifi package is installed so that certificate verification is done properly. Now comes the fun part: the actual code of the filament responsible for sending the kernel event stream to Logsene. Let's go through it.

"""

Performs the indexing of the kernel's event stream to
Elasticsearch at regular intervals. When the scheduled
interval elapses, the aggregated list of documents
is indexed to Elasticsearch.
"""
from datetime import datetime

# Kernel events buffered between two interval ticks.
documents = []


def on_init():
    # Capture only the kernel events we want to index.
    set_filter('CreateThread', 'CreateProcess', 'TerminateThread',
               'TerminateProcess', 'CreateFile', 'DeleteFile',
               'WriteFile', 'RenameFile', 'Recv', 'Send', 'Accept',
               'Connect', 'Disconnect', 'LoadImage', 'UnloadImage',
               'RegCreateKey', 'RegDeleteKey', 'RegSetValue')
    # Fire on_interval every second.
    set_interval(1)


def on_next_kevent(kevent):
    # Build the document for this kernel event and buffer it.
    doco = {'image': kevent.thread.name,
            'thread': {
                'exe': kevent.thread.exe,
                'comm': kevent.thread.comm,
                'pid': kevent.thread.pid,
                'tid': kevent.tid,
                'ppid': kevent.thread.ppid},
            'category': kevent.category,
            'name': kevent.name,
            'ts': '%s %s' % (datetime.now().strftime('%m/%d/%Y'),
                             kevent.timestamp.strftime('%H:%M:%S.%f')),
            'cpuid': kevent.cpuid,
            'params': kevent.params}
    documents.append(doco)


def on_interval():
    # Index the buffered documents in one batch, then reset the buffer.
    if len(documents) > 0:
        elasticsearch.emit(documents)
        documents.clear()

First, when the filament is initialized, we set the list of filters for the kernel events we want to capture. Because the indexing operation takes place periodically, we need to call the set_interval function to establish the interval in seconds. The on_next_kevent function appends each kernel event payload to the documents list, which is later consumed and indexed to Logsene when the on_interval function fires. That's pretty straightforward. Let's run the filament:

$ fibratus run --filament elasticsearch_indexing

If you open your Logsene application, you should see the data stream coming in, as in the figure below.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image06-1024x445.png]
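The buffer-and-flush pattern used by the filament can be tried outside Fibratus. The sketch below is only a simulation under stated assumptions: FakeEmitter and run_simulation are hypothetical stand-ins written for this example, and the real elasticsearch object and interval timer are supplied by Fibratus at run time.

```python
# Stand-ins for what Fibratus provides to a filament at run time. The names
# FakeEmitter and run_simulation are hypothetical and exist only in this sketch.
class FakeEmitter:
    def __init__(self):
        self.batches = []

    def emit(self, docs):
        # Copy the list, because the caller clears it right after emitting.
        self.batches.append(list(docs))


elasticsearch = FakeEmitter()
documents = []


def on_next_kevent(kevent):
    # Buffer every incoming event.
    documents.append(kevent)


def on_interval():
    # Send the buffered events as one batch, then reset the buffer.
    if documents:
        elasticsearch.emit(documents)
        documents.clear()


def run_simulation(events, events_per_interval):
    """Feed events one by one and fire on_interval after every N events."""
    for count, kevent in enumerate(events, start=1):
        on_next_kevent(kevent)
        if count % events_per_interval == 0:
            on_interval()
    on_interval()  # flush whatever is left


run_simulation([{'name': 'CreateFile'}, {'name': 'WriteFile'}, {'name': 'Send'}], 2)
print([len(batch) for batch in elasticsearch.batches])  # [2, 1]
```

Batching trades a small delay (at most one interval) for far fewer round trips than indexing each kernel event individually.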

Exploring the data

Logsene comes with out-of-the-box support for histograms and top list values. Those come in very handy for gaining some basic insights into kernel and user space activity. For example, let's visualize the top kernel events grouped by event name as well as by category.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image00-300x142.png]

[Figure: https://sematext.com/wp-content/uploads/2016/11/image03-1-300x188.png]

The above charts give us a clear breakdown of a trace captured with Fibratus. File system and network operations accounted for the majority of the events in this trace. The CreateFile and WriteFile kernel events map to the NtCreateFile and NtWriteFile system calls respectively. That doesn't mean there were 7928 file creations, though. When a user space process requests access to a file or an I/O device, it issues the NtCreateFile system call which, depending on the arguments passed, will open or create the underlying file or operate on the I/O device. To narrow down the scope of the search, we can filter by the params.operation field.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image07-1024x403.png]

You can see all the files created during the trace. In my case, they came from various threads (note the thread.tid field) of the chrome process, which created a number of temporary files, and from the single-threaded sysmon process, which created directories. We can also build a dashboard widget to observe the progression of file creations.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image02-1-768x330.png]
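The same narrowing by params.operation can be reproduced on documents you have already fetched. This is a rough sketch on made-up data: the documents follow the shape built in on_next_kevent, but the operation values (CREATE, OPEN), the file parameter name, and the paths are assumptions for illustration, not values confirmed by the trace.

```python
from collections import Counter

# Made-up documents in the shape produced by the filament's on_next_kevent.
# The 'operation' values, the 'file' key and the paths are assumptions.
docs = [
    {'name': 'CreateFile', 'image': 'chrome.exe',
     'params': {'operation': 'CREATE', 'file': 'C:\\Temp\\a.tmp'}},
    {'name': 'CreateFile', 'image': 'chrome.exe',
     'params': {'operation': 'OPEN', 'file': 'C:\\Temp\\b.dat'}},
    {'name': 'CreateFile', 'image': 'sysmon.exe',
     'params': {'operation': 'CREATE', 'file': 'C:\\Temp\\logs'}},
    {'name': 'WriteFile', 'image': 'chrome.exe',
     'params': {'file': 'C:\\Temp\\a.tmp'}},
]


def by_operation(docs, operation):
    """Keep only CreateFile events whose params.operation matches."""
    return [d for d in docs
            if d['name'] == 'CreateFile'
            and d['params'].get('operation') == operation]


created = by_operation(docs, 'CREATE')
per_image = Counter(d['image'] for d in created)
print(len(created), dict(per_image))
```

Only two of the three CreateFile events in this toy data are actual creations, which mirrors the point that a CreateFile event is not necessarily a file creation.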

Tracing the antivirus software

I was curious to see what the AVG antivirus was doing on my computer. All you need to do in Logsene is run a wildcard search for documents whose image field contains the avg segment.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image08-1024x540.png]

You might expect to see a lot of file system related operations. Indeed, a large number of kernel events deal with accessing file objects in order to find malicious patterns. While keeping Fibratus running, the next thing I did was launch a scan from the AVG interface. The first thing that caught my attention was a sequence of TCP packets generated by the AVG User Interface (avgui.exe) and the AVG WatchDog Service (avgwdsvca.exe) processes.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image04-1024x136.png]

Those packets were encapsulating HTTP requests to the 95.100.126.85 IP address. An IP WHOIS lookup reveals that Akamai Technologies is behind that address. Hmm, those might be the servers from which the antivirus pulls its updates.
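For readers who prefer to work on exported documents, the wildcard match on the image field can be approximated locally. The sketch below uses Python's fnmatch module on made-up documents; match_image is a hypothetical helper, and the matching is a local approximation rather than Logsene's own query engine.

```python
from fnmatch import fnmatchcase


def match_image(docs, pattern):
    """Return documents whose image field matches a shell-style wildcard."""
    return [d for d in docs
            if fnmatchcase(d['image'].lower(), pattern.lower())]


# Made-up documents; only the process names come from the trace described above.
docs = [
    {'image': 'avgui.exe', 'name': 'Send'},
    {'image': 'avgwdsvca.exe', 'name': 'RegSetValue'},
    {'image': 'chrome.exe', 'name': 'CreateFile'},
]

hits = match_image(docs, 'avg*')
print(sorted({d['image'] for d in hits}))  # ['avgui.exe', 'avgwdsvca.exe']
```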

Tracing threads and registry I/O

Next, I wanted to see the thread and the registry I/O activity. A huge number of threads were created, mostly the ones associated with the user interface processes.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image05-1024x332.png]

I didn’t find much registry activity, except the AVG Watchdog Service setting the SOFTWARE\AVG\Zen\ConfigData\ZenMainWindowOpen binary value and the avgui process trying to create the Software\Microsoft\Windows\CurrentVersion\Internet Settings\Connections registry key.

[Figure: https://sematext.com/wp-content/uploads/2016/11/image01-1-768x288.png]

We have just scratched the surface of what's possible when you dig into the sequence of kernel events emitted by Fibratus and combine it with the analysis capabilities of the Logsene platform.
