Showing posts with label device driver.

Sunday, March 16, 2014

Cache Management in ARM

What is Cache?
A cache is a small, fast memory where the processor can keep copies of data and instructions.

Why do we need Cache?
Accessing main memory to fetch instructions and data is far slower than the processor clock, so a single fetch can take multiple clock cycles. To improve this, ARM provides a cache: a small, fast block of memory that sits between the processor core and main memory and holds copies of items in main memory. The cache acts as an intermediate memory that is much faster to access than main memory.

How does it work?
If a system has a cache, it must also have a cache controller: a piece of hardware that manages the cache without the programmer's involvement. Whenever the processor wants to read or write something, it first checks the cache; this is called a cache lookup. If the item is there (a cache hit), the result is returned to the processor immediately. If it is not (a cache miss), the request is forwarded to main memory.
Besides checking whether data or instructions exist in the cache, the core also contains a write buffer, which saves the processor further time. When the processor executes a store instruction, it can simply deposit the relevant information into the write buffer: which location to write, the data to be copied, and its size. The processor is then free to move on to the next instruction, and the write buffer drains the data to memory in the background.
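The lookup described above can be sketched with a toy user-space model. This is only an illustration of the hit/miss logic (all names here are made up); on real ARM hardware the cache controller does all of this transparently.

```c
#include <stdbool.h>
#include <stdint.h>

/* A toy direct-mapped cache model: 16 lines of 32 bytes each. */
#define LINE_SIZE  32
#define NUM_LINES  16

struct cache_line {
    bool     valid;
    uint32_t tag;
};

static struct cache_line cache[NUM_LINES];

/* Returns true on a cache hit, false on a miss (on a miss it fills
 * the line, as the controller would after fetching from memory). */
bool cache_lookup(uint32_t addr)
{
    uint32_t index = (addr / LINE_SIZE) % NUM_LINES;
    uint32_t tag   = addr / (LINE_SIZE * NUM_LINES);

    if (cache[index].valid && cache[index].tag == tag)
        return true;            /* cache hit */

    cache[index].valid = true;  /* cache miss: refill the line */
    cache[index].tag   = tag;
    return false;
}
```

Note how two addresses that map to the same index but carry different tags evict each other: that is the conflict-miss behavior of a direct-mapped cache.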

Levels of Cache:
There are generally two levels of cache (L1 and L2); some architectures have only one (L1).
L1 caches are typically connected directly to the core logic that fetches instructions and handles load and store instructions. They are Harvard caches, that is, there are separate caches for instructions and for data, and they effectively appear as part of the core. They are small, typically 16KB or 32KB; the size is limited by the need to provide single-cycle access at core speeds of 1GHz or more.
The other is the L2 cache. L2 caches are larger but slower, and unified in nature: a single cache holds both instructions and data. The L2 cache may be part of the core or an external block.

Cache Coherency:
Since every core has its own L1 cache, a mechanism is needed to maintain coherency between all the caches: if one cache is updated and another is not, the system sees inconsistent data.
There are two ways to solve this:

1. Hardware-managed coherency: the most efficient solution. Data that is shared among caches is always kept up to date, so everything in the sharing domain always sees exactly the same value for the shared data.
2. Software-managed coherency: here the software, usually device drivers, must clean or flush dirty data from the caches, and invalidate stale data, to enable sharing with other processors or masters in the system. This costs processor cycles, bus bandwidth, and power.
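As a concrete example of software-managed coherency, Linux drivers use the streaming DMA-mapping API, which performs the clean/invalidate operations for you. This is only a sketch of a kernel fragment (the `dev` and `buf` arguments are assumed to come from the surrounding driver), not a standalone module:

```c
#include <linux/dma-mapping.h>

/* Share a CPU buffer with a bus master, maintaining coherency in
 * software. The direction argument tells the API which cache
 * maintenance (clean or invalidate) is required. */
static void share_buffer(struct device *dev, void *buf, size_t len)
{
    dma_addr_t handle = dma_map_single(dev, buf, len, DMA_BIDIRECTIONAL);

    /* CPU has written buf: clean (write back) dirty lines so the
     * device reads up-to-date data from memory. */
    dma_sync_single_for_device(dev, handle, len, DMA_BIDIRECTIONAL);

    /* ... device transfers happen here ... */

    /* Device has written buf: invalidate stale lines so the CPU
     * re-reads the new data from memory. */
    dma_sync_single_for_cpu(dev, handle, len, DMA_BIDIRECTIONAL);

    dma_unmap_single(dev, handle, len, DMA_BIDIRECTIONAL);
}
```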

Wednesday, March 12, 2014

Why use request_threaded_irq?

I know that for many people this function is already clear, but I found it quite difficult to digest, so let's start:

Why do we need it?
From the beginning, it has been desirable for the kernel to reduce the time the processor spends in interrupt context. To address this, the top-half/bottom-half concept was introduced: time-critical work goes in the top half and the rest is deferred to the bottom half. While the processor executes the interrupt handler it is in interrupt context with that interrupt line disabled, which is bad because, if the line is shared, other interrupts on it cannot be handled during that time, hurting overall system latency.
To reduce this time further, the kernel developers came up with request_threaded_irq().

How it works?
Before going further with the functionality, let's check the function definition first:

int request_threaded_irq(unsigned int irq,
                         irq_handler_t handler,
                         irq_handler_t thread_fn,
                         unsigned long irqflags,
                         const char *devname,
                         void *dev_id);

The difference between this function and the usual request_irq() is the extra thread_fn argument.
Now let's understand the functionality. request_threaded_irq() splits the handler code into two parts: the handler and the thread function. The handler's main job is to acknowledge the interrupt to the hardware and wake up the thread function. As soon as the handler finishes, the line is free again and the processor can receive new interrupts, while the remaining work runs in the thread function in process context. This improves overall latency.

Misc:
Some driver code passes NULL as the handler to request_threaded_irq(); in that case the kernel installs a default handler that simply wakes up the thread function.
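A typical use looks like the sketch below. This is a kernel fragment, not a complete driver; the mydev_* names are made up for illustration:

```c
#include <linux/interrupt.h>

/* Primary handler: runs in hard-IRQ context. Quickly acknowledge the
 * hardware, then ask the core to wake the thread function. */
static irqreturn_t mydev_hardirq(int irq, void *dev_id)
{
    /* ack/mask the device interrupt here, e.g. a register write */
    return IRQ_WAKE_THREAD;
}

/* Thread function: runs in process context, so it may sleep,
 * e.g. to talk to the device over I2C. */
static irqreturn_t mydev_thread(int irq, void *dev_id)
{
    /* long-running or sleeping work goes here */
    return IRQ_HANDLED;
}

/* In probe():
 *
 *   ret = request_threaded_irq(irq, mydev_hardirq, mydev_thread,
 *                              IRQF_ONESHOT, "mydev", dev);
 *
 * Passing NULL instead of mydev_hardirq makes the kernel use its
 * default primary handler, which just returns IRQ_WAKE_THREAD
 * (IRQF_ONESHOT is then mandatory).
 */
```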

When to use request_threaded_irq instead of bottom halves?
The answer lies in the driver's requirements: if the handler code needs to sleep, put it in the thread function and use a threaded interrupt.

Thursday, March 6, 2014

DMA : Direct Memory Access

Why do we use it?
Normally, whenever a data transfer happens, the processor has to be involved; when there is a lot of data, this can consume a lot of CPU time. To overcome this, the concept of DMA was introduced: with DMA, peripherals can access system memory directly, without going through the processor.
Before DMA, the CPU was kept busy while a transfer was happening; with DMA, the CPU can do other work while the transfer takes place.

Misc:
A DMA controller has its own configuration, which describes what it will be transferring: where it takes the data from, where it puts it, and what the size is, among many other settings. There is typically also an internal buffer, which takes data from the source and then puts it at the destination.

To start a transfer, there is a dedicated control bit: writing it initiates the transfer.
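The programming model above can be sketched as a toy user-space simulation. The register names and layout here are invented for illustration; a real controller's layout comes from its datasheet, and the transfer would proceed in the background rather than completing instantly:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a DMA controller's programming interface. */
struct dma_regs {
    const void *src;   /* source address    */
    void       *dst;   /* destination       */
    size_t      size;  /* transfer length   */
    uint32_t    ctrl;  /* bit 0 = start bit */
};

#define DMA_CTRL_START 0x1u

/* Writing the start bit kicks off the transfer; here we model the
 * whole transfer completing at once, and the controller clearing
 * the start bit when it is done. */
void dma_write_ctrl(struct dma_regs *regs, uint32_t val)
{
    regs->ctrl = val;
    if (regs->ctrl & DMA_CTRL_START) {
        memcpy(regs->dst, regs->src, regs->size);
        regs->ctrl &= ~DMA_CTRL_START;
    }
}
```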

 

Thursday, February 27, 2014

I2C Communication

The I2C (Inter-Integrated Circuit) protocol is used to transfer data between a master and a slave device. There are two lines between them: one for data and one for the clock.

I2C is a synchronous transfer mechanism, since it uses the clock to pace the data: for a given transfer, master and slave send and receive at the same rate.

The master is called the master because it alone can initiate a transfer.

There are two ways to register a slave device, depending on whether it is connected through a dedicated I2C controller or bit-banged over GPIO lines. In the first case the bus name used is "i2c", while in the second it is "i2c-gpio".

On the device side, we define an i2c_board_info for every slave; it contains the slave name, the slave address, and optionally platform data. The slave is then registered using i2c_register_board_info(), to which we pass the bus number and the board info.

The flow differs if we are using GPIO: there we create a platform device named "i2c-gpio" with a bus number, and pass it to platform_add_devices() to get it registered.


On the driver side, we define all the functions such as probe, remove and pm_ops, plus the driver name (which must match the device name). Finally we call i2c_add_driver(). As soon as the device and the driver are matched, the probe function is called.
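Both sides can be sketched as below. This is a hedged kernel fragment for the 3.x-era I2C API, not a working driver: the "mytouch" name, the address 0x4a, and the bus number 1 are all invented examples.

```c
#include <linux/i2c.h>
#include <linux/module.h>

/* Board side: describe the slave sitting at address 0x4a on bus 1. */
static struct i2c_board_info mytouch_info __initdata = {
    I2C_BOARD_INFO("mytouch", 0x4a),
    /* .irq = gpio_to_irq(...), .platform_data = ..., as needed */
};
/* in board init code: i2c_register_board_info(1, &mytouch_info, 1); */

/* Driver side: the name must match the board info name above. */
static int mytouch_probe(struct i2c_client *client,
                         const struct i2c_device_id *id)
{
    /* allocate state, request the interrupt, etc. */
    return 0;
}

static const struct i2c_device_id mytouch_id[] = {
    { "mytouch", 0 },
    { }
};

static struct i2c_driver mytouch_driver = {
    .driver   = { .name = "mytouch" },
    .probe    = mytouch_probe,
    .id_table = mytouch_id,
};
/* module_i2c_driver(mytouch_driver); */
```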

Now let's take the example of a touch screen. Whenever the user touches the screen, an interrupt is generated (it was registered in probe). Only through this interrupt does the master (the touch-screen driver) learn that there is data available on the slave side that needs to be read. It then initiates the transfer and reads the data byte by byte. Writing works the same way, with small changes in the flow.



Wednesday, June 19, 2013

Device driver diary : my notes

Virtual devices are devices that have no real hardware; they are used to provide certain functionality.
Character devices provide an interface for the platform to access hardware through certain file operations.
Platform devices are used to put certain functionality inside the device, so that other devices (virtual or physical) can use it.
How does the init function of each driver get called?
Ans: When module code is compiled, the build places a pointer to the module's init function in a dedicated initcall section of the kernel image. At boot, a function in main.c walks all of these function pointers and calls them one by one.
- device_create() is used to create a device, but we need to say whether it is a platform device or a character device. Character devices do that using cdev_init()/cdev_add(); platform devices use platform_device_register().
- The major:minor combination is how the device file identifies the device driver, so we use this number both when creating the device and when adding it to the device hierarchy.
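The character-device registration steps in the notes above can be sketched as a kernel fragment. The names ("mydev", my_fops) are illustrative, and error handling is omitted for brevity:

```c
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/fs.h>
#include <linux/module.h>

static dev_t devno;            /* holds the major:minor pair */
static struct cdev my_cdev;

static const struct file_operations my_fops = {
    .owner = THIS_MODULE,
    /* .open, .read, .write, ... */
};

static int __init my_init(void)
{
    /* ask the kernel for a free major and one minor */
    alloc_chrdev_region(&devno, 0, 1, "mydev");

    /* tie the cdev to our file operations and add it to the hierarchy */
    cdev_init(&my_cdev, &my_fops);
    cdev_add(&my_cdev, devno, 1);

    /* device_create(my_class, NULL, devno, NULL, "mydev");
       would then make /dev/mydev appear via udev */
    return 0;
}
```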
Character Device:
Refer to this link: http://www.linuxforu.com/2011/02/linux-character-drivers/

Device Driver Diary : Uevent and Netlink Socket for Kernel-Platform Communication

uevent is a kernel state-change notification mechanism: whenever there is a change in hardware that needs to be conveyed to the platform, the kernel sends a uevent. On the platform side, a daemon monitors these events and responds accordingly.
So a uevent is nothing but a notification carrying detailed information about an event. For example:
add@/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.2/2-1.2:1.0/host6 ACTION=add DEVPATH=/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.2/2-1.2:1.0/host6 SUBSYSTEM=scsi DEVTYPE=scsi_host SEQNUM=1165
This tells us many things: the device name and path, the subsystem, the device type, the action (add, remove), and so on.
Let's see how the kernel adds this variable info. For this it uses add_uevent_var(), which appends one variable at a time. For example:
env = kzalloc(sizeof(struct kobj_uevent_env), GFP_KERNEL);
retval = add_uevent_var(env, "ACTION=%s", action_string);
retval = add_uevent_var(env, "DEVPATH=%s", devpath);
retval = add_uevent_var(env, "SUBSYSTEM=%s", subsystem);
Now that the string is ready, we call:
kobject_uevent_env(struct kobject *kobj, enum kobject_action action, char *envp_ext[]);
/* for USB specifically */
kobject_uevent_env(&dev->dev->kobj, KOBJ_CHANGE, disconnected);
So if you want to send any event to the platform, ultimately you have to use this method.
Now let's look a little at how kobject_uevent_env() works internally. This method uses a netlink socket. Netlink sockets are sockets that can be used to initiate a connection from the kernel side. It works like this:
1. A socket is created on the kernel side, with a buffer associated with it.
2. It calls netlink_broadcast_filtered().
3. That calls do_one_broadcast().
4. That calls netlink_broadcast_deliver(), which puts the message into the socket's receive queue. It also calls sk->sk_data_ready().
5. Finally sock_def_readable() runs, which signals the user-side socket that the data from the kernel is ready to be read.
On the platform side we have a file at hardware/libhardware_legacy/uevent/ named uevent.c. It has a function uevent_init(), which creates the netlink socket on the platform side using socket() and binds a local address to it using bind(). Then the uevent_next_event() function, called by the application, keeps checking the socket for new data.
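The user-space side described above boils down to a few calls. This is a minimal Linux-only sketch of the same pattern uevent_init() uses (open_uevent_socket is a made-up name, and receiving uevents may require sufficient privileges):

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open a netlink socket subscribed to kernel uevents.
 * Returns a file descriptor, or -1 on failure. */
int open_uevent_socket(void)
{
    struct sockaddr_nl addr;
    int fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid    = getpid();  /* our own netlink address */
    addr.nl_groups = 1;         /* join the uevent multicast group */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    /* recv() on fd now blocks until the kernel broadcasts a uevent,
     * which answers the polling question below: the reader sleeps and
     * is woken by sock_def_readable(), instead of busy-polling. */
    return fd;
}
```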
Important Files :
1. hardware/libhardware_legacy/uevent/uevent.c
2. kernel/lib/kobject_uevent.c
3. kernel/net/netlink/af_netlink.c
4. kernel/drivers/usb/gadget/android.c [for usb related uevents]
5. frameworks/base/core/java/android/os/UEventObserver.java
6. frameworks/base/services/java/com/android/server/usb/UsbService.java
References :
1. http://blog.csdn.net/dfysy/article/details/7330919 [translate it; it explains the uevent flow in detail]
2. http://www.linuxjournal.com/article/7356 [complete tutorial on netlink socket, with example]
3. Man page of socket(), bind() and netlink() @ http://linux.die.net
I hope this helps. I still have one doubt: how is a netlink socket different from normal polling? If you have a clue, please comment below.
EDIT: Mr. Rami Rosen told me that it is the generic kernel socket implementation: when a write is done on the socket's queue, a flag is set saying the socket can be awakened. Until then, the poll stays in a blocking state.

Relation between device registration and driver registration

Hi all,
Today I observed an interesting thing while calling platform_driver_register(). Normally, if you have a device and a driver, they get linked. But suppose you want some different behavior per device, for example initializing a few variables before starting the real functionality of the device. A nice way to do this is to create multiple devices with the same name but different ids, and register them using platform_device_register() or platform_add_devices().
Now when you call platform_driver_register(), it checks for all available devices with the same name, and probe() is called once for each of them. Inside probe() you can put your per-device initialization logic, keyed on the id.
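The idea above can be sketched as a kernel fragment. The "mydev" name and the two ids are made-up examples:

```c
#include <linux/platform_device.h>

/* Two devices share the name "mydev" but carry different ids,
 * so the same driver's probe() runs once per device. */
static struct platform_device mydev0 = { .name = "mydev", .id = 0 };
static struct platform_device mydev1 = { .name = "mydev", .id = 1 };
static struct platform_device *devs[] = { &mydev0, &mydev1 };
/* platform_add_devices(devs, ARRAY_SIZE(devs)); */

static int mydev_probe(struct platform_device *pdev)
{
    /* pdev->id is 0 or 1: branch on it for per-device setup */
    return 0;
}

static struct platform_driver mydev_driver = {
    .probe  = mydev_probe,
    .driver = { .name = "mydev" },
};
/* platform_driver_register(&mydev_driver) will then call
   mydev_probe() twice, once for each matching device. */
```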
Hope it will help you.