File access optimization discovery and visualization

Fast and Lean

Functions Altering the Offset

Opening a file sets the offset of the new file handle to position zero. Closing an active file handle removes its cursor from the file. Three types of functions change the offset according to a given parameter: seek, write, and read. The seek function jumps to a desired position inside the file and sets the offset associated with the file handle directly, whereas write and read functions advance the cursor relative to its previous position.

Read functions read a specified number of bytes beginning at the current cursor position. After the call, the offset has increased by the number of bytes read. A write function places its content at the current cursor position and, like a read function, moves the cursor to the end of the written block, so the new offset is the previous offset plus the number of bytes written. A write does not shift existing data: It overwrites the content in the affected byte range and increases the file size when it extends past the end of the file.
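The offset rules above can be observed directly with low-level file calls. The following is a minimal sketch that assumes a POSIX-style interface as exposed by Python's os module; the helper name trace_offsets and the 10-byte sample content are made up for illustration.

```python
import os
import tempfile


def trace_offsets():
    """Return the offsets observed after open, read, seek, and write calls."""
    offsets = {}
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")  # sample content: 10 bytes
        os.close(fd)

        fd = os.open(path, os.O_RDWR)
        # lseek(fd, 0, SEEK_CUR) reports the current offset without moving it.
        offsets["open"] = os.lseek(fd, 0, os.SEEK_CUR)

        os.read(fd, 4)  # read advances the offset by the bytes read
        offsets["read"] = os.lseek(fd, 0, os.SEEK_CUR)

        # seek sets the offset directly to an absolute position
        offsets["seek"] = os.lseek(fd, 8, os.SEEK_SET)

        # write overwrites bytes 8-9 and extends the file by two bytes
        os.write(fd, b"abcd")
        offsets["write"] = os.lseek(fd, 0, os.SEEK_CUR)
        offsets["size"] = os.fstat(fd).st_size
        os.close(fd)
    finally:
        os.remove(path)
    return offsets
```

In this sketch, the write starts two bytes before the end of the file, so it both overwrites existing content and increases the file size.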

Possible Cases

By keeping these offset-altering functions in mind, you can determine whether and how multiple active file accesses in the same file affect one another.

To structure all possibilities, we examine each potential repeated function type and analyze the different calls that can occur between them. When combining calls, we assume the application may allocate more memory for buffering file access.

Repeated open or close function calls are excluded because they do not typically occur. Multiple seek calls that directly follow one another most likely indicate a coding issue: A seek that is not followed by a different function before the next seek has no effect, so the first call is obsolete.
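A check for such back-to-back seeks could look like the following sketch. The trace format (a list of tuples whose first element names the function) and the function name find_redundant_seeks are hypothetical; the sketch also assumes that the trace contains the calls of a single file handle and that all seeks use absolute positions.

```python
def find_redundant_seeks(trace):
    """Return the indices of seek calls that are directly followed by another seek.

    Assumption: trace is a list of tuples such as ("seek", 100) or
    ("read", 50) that belong to one file handle, and seeks are absolute.
    """
    redundant = []
    for i in range(len(trace) - 1):
        if trace[i][0] == "seek" and trace[i + 1][0] == "seek":
            redundant.append(i)  # the earlier seek is overridden by the next one
    return redundant


# Hypothetical call sequence: the seek at index 1 is overridden by index 2.
example_trace = [
    ("open",),
    ("seek", 100),
    ("seek", 200),
    ("read", 50),
    ("close",),
]
```

With relative seeks, the first call would not be obsolete, although the two calls could still be combined into one.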

Repeated Reads

Read functions only read a sequence of bytes without modifying the file itself; therefore, in many cases, you can merge two sequential reads into one. Merging is always possible if the intermediate function calls do not change the file either. When a seek call or another read call lies between the repeated calls, you can merge the two reads and perform the merged call before or after the intervening function.
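As an illustration, the following sketch compares two sequential reads with one merged read that is split afterward in memory. The function names and the sample file content are assumptions for this example, and the code relies on reads of a regular file returning the requested number of bytes when enough data is available.

```python
import os
import tempfile


def two_reads(path, n1, n2):
    """Issue two sequential read calls on one file handle."""
    fd = os.open(path, os.O_RDONLY)
    try:
        first = os.read(fd, n1)
        second = os.read(fd, n2)
    finally:
        os.close(fd)
    return first, second


def merged_read(path, n1, n2):
    """Issue one read call for both ranges and split the buffer afterward."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = os.read(fd, n1 + n2)
    finally:
        os.close(fd)
    return buf[:n1], buf[n1:n1 + n2]


def demo():
    """Create a sample file and return the results of both variants."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789abcdef")
        os.close(fd)
        return two_reads(path, 4, 6), merged_read(path, 4, 6)
    finally:
        os.remove(path)
```

Both variants return the same byte sequences here because nothing modifies the file between the reads; the merged variant needs a larger buffer but only one function call.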

However, if a write call from another file access occurs on the file, you need to consider the byte ranges affected by the calls. Figure 3 shows the second read call of the first file access reading bytes written by another file access after the first read call. In this case, you cannot simply merge the second read into the first because doing so would change the read result.

Figure 3: A write function call interfering with repeated read function calls.

A write call that lands within the read area between two read calls changes the content returned by the second read, which prevents merging the two reads. However, if the bytes are written beyond the read areas, the areas do not overlap, and you can merge the two read calls and perform the merged call before or after the write call (Figure 4).

Figure 4: Repeated read function calls with a non-interfering write in between can be optimized.
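The distinction between Figures 3 and 4 comes down to a byte range overlap test. The following sketch is a conservative check under stated assumptions: The two reads are contiguous and start at read_offset, the intervening write covers write_len bytes starting at write_offset, and any overlap with the combined read area rules out a merge. The function and parameter names are hypothetical.

```python
def ranges_overlap(start_a, len_a, start_b, len_b):
    """Return True if two half-open byte ranges share at least one byte."""
    if len_a <= 0 or len_b <= 0:
        return False  # an empty range cannot overlap anything
    return start_a < start_b + len_b and start_b < start_a + len_a


def can_merge_reads(read_offset, n1, n2, write_offset, write_len):
    """Conservatively decide whether two contiguous reads can be merged.

    The combined read area is [read_offset, read_offset + n1 + n2).
    A write that touches any byte of this area blocks the merge.
    """
    return not ranges_overlap(read_offset, n1 + n2, write_offset, write_len)
```

For example, with reads of 4 and 6 bytes starting at offset 0, a 3-byte write at offset 4 falls inside the read area and blocks the merge, whereas a 5-byte write at offset 10 lies beyond it and does not.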
