Welcome!

Related Topics: Java IoT

Java IoT: Article

Java Streams Basics

Lesson 5

Most of the programs work with external data stored either in local files or coming from other computers on the network. Java has a concept of working with so-called streams of data. After a physical data storage is mapped to a logical stream, a Java program reads data from this stream serially - byte after byte, character after character, etc. Some of the types of streams are byte streams (InputStream, OutputStream) and character streams (Reader and Writer). The same physical file could be read using different types of streams, for example, FileInputStream, or FileReader.

Classes that work with streams are located in the package java.io. Java 1.4 has introduced the new package java.nio with improved performance, which is not covered in this lesson.

There are different types of data, and hence different types of streams.

Here's the sequence of steps needed to work with a stream:

  1. Open a stream that points at a specific data source: a file, a socket, URL, etc.
  2. Read or write data from/to this stream.
  3. Close the stream.

Let's have a closer look at some of the Java streams.

Byte Streams

If a program needs to read/write bytes (8-bit data), it could use one of the subclasses of the InputStream or OutputStream respectively. The example below shows how to use the class FileInputStream to read a file named abc.dat. This code snippet prints each byte's value separated with white spaces. Byte values are represented by integers from 0 to 255, and if the read() method returns -1, this indicates the end of the stream.

import java.io.FileInputStream;
import java.io.IOException;
public class ReadingBytes {
public static void main(String[] args) {  FileInputStream myFile = null;  try {
  myFile = new FileInputStream("c:\\abc.dat");  // open the  stream
  boolean eof = false;
   while (!eof) {
    int byteValue = myFile.read();  // read  the stream
    System.out.println(byteValue);
    if (byteValue  == -1){
    eof = true;
    }
    //myFile.close();  // do not do it here!!!
   }
  }catch (IOException e) {
      System.out.println("Could not read file: " + e.toString());
  } finally{
   try{
     if (myFile!=null){
       myFile.close(); // close the stream
     }
   } catch (Exception e1){
    e1.printStackTrace();
   }
  }
 }
}

Please note that the stream is closed in the clause finally. Do not call the method close() inside of the try/catch block right after the file reading is done. In case of exception during the file read, the program would jump over the close() statement and the stream would never be closed!

The next code fragment writes into the file xyz.dat using the class FileOutputStream:

int somedata[]={56,230,123,43,11,37};
FileOutputStream myFile = null;
try {
  myFile = new  FileOutputStream("c:\xyz.dat");
  for (int i = 0; i <somedata.length;I++){
      myFile.write(data[I]);
  }
} catch (IOException e)
      System.out.println("Could not write to a file: " + e.toString()); }
finally
    // Close the file the same way as in example above }

Buffered Streams

So far we were reading and writing one byte at a time. Disk access is much slower than the processing performed in memory. That's why it's not a good idea to access disk 1000 times for reading a file of 1000 bytes. To minimize the number of time the disk is accessed, Java provides buffers, which are sort of "reservoirs of data". For example, the class BufferedInputStream works as a middleman between the FileInputStream and the file itself. It reads a big chunk of bytes from a file in one shot into memory, and, then the FileInputStream will read single bytes from there. The BufferedOutputStream works in a similar manner with the class FileOutputStream. Buffered streams just make reading more efficient.

You can use stream chaining (or stream piping) to connect streams - think of connecting two pipes in plumbing. Let's modify the example that reads the file abc.dat to introduce the buffering:

FileInputStream myFile = null;
BufferedInputStream buff =null
try {
  myFile = new  FileInputStream("abc.dat");
  BufferedInputStream buff = new BufferedInputStream(myFile);
  boolean eof = false;
  while (!eof) {
    int byteValue = buff.read();
    System.out.print(byteValue + " ");
    if (byteValue  == -1)
      eof = true;
  }
} catch (IOException e) {
  e.printStackTrace();
} finally{
  buff.close();
  myFile.close();
}

It's a good practice to call the method flush() when the writing into a BufferedOutputStream is done - this forces any buffered data to be written out to the underlying output stream.

While the default buffer size varies depending on the OS, it could be controlled. For example, to set the buffer size to 5000 bytes do this:

BufferedInputStream buff = new BufferedInputStream(myFile, 5000);

Character Streams

Java uses two-byte characters to represent text data, and the classes FileReader and FileWriter work with text files. These classes allow you to read files either one character at a time with read(), or one line at a time with readLine().

The classes FileReader and FileWriter also support have buffering with the help of BufferedReader and BufferedWriter. The following example reads text one line at a time:

FileReader myFile = null;
BufferedReader buff = null;
try {
  myFile = new FileReader("abc.txt");
  buff = new BufferedReader(myFile);
  boolean eof = false;
  while (!eof) {
    String line = buff.readLine();
    if (line == null)
      eof = true;
    else
      System.out.println(line);
    }
    ....
}

For the text output, there are several overloaded methods write() that allow you to write one character, one String or an array of characters at a time.

To append data to an existing file while writing, use the 2-arguments constructor (the second argument toggles the append mode):

FileWriter fOut = new FileWriter("xyz.txt", true);

Below is yet another version of the tax calculation program (see the lesson Intro to Object-Oriented Programming with Java). This is a Swing version of the program and it populates the populate the dropdown box chState with the data from the text file states.txt.

import java.awt.event.*;
import java.awt.*;
import java.io.FileReader;
import java.io.BufferedReader;
import java.io.IOException;

public class TaxFrameFile extends java.awt.Frame implements ActionListener {
  Label lblGrIncome;
  TextField txtGrossIncome = new TextField(15);
  Label lblDependents=new Label("Number of Dependents:");
  TextField txtDependents = new TextField(2);
  Label lblState = new Label("State: ");
  Choice chState = new Choice();

  Label lblTax = new Label("State Tax: ");
  TextField txtStateTax = new TextField(10);
  Button bGo = new Button("Go");
  Button bReset = new Button("Reset");

  TaxFrameFile() {
    lblGrIncome = new Label("Gross Income: ");
    GridLayout gr = new GridLayout(5,2,1,1);
    setLayout(gr);

    add(lblGrIncome);
    add(txtGrossIncome);
    add(lblDependents);
    add(txtDependents);
    add(lblState);
    add(chState);
    add(lblTax);
    add(txtStateTax);
    add(bGo);
    add(bReset);

    // Populate states from a file
    populateStates();
    txtStateTax.setEditable(false);

    bGo.addActionListener(this);
    bReset.addActionListener(this);

    // Define, instantiate and register a WindowAdapter
    // to process windowClosing Event of this frame

       this.addWindowListener(new WindowAdapter() {
        public void windowClosing(WindowEvent e) {
                           System.out.println("Good bye!");
            System.exit(0);
        }});
    }

    public void actionPerformed(ActionEvent evt) {
       Object source = evt.getSource();
        if (source == bGo ){
           // The Button Go processing
             try{
               int grossInc =
                Integer.parseInt(txtGrossIncome.getText());
               int dependents   =
                 Integer.parseInt(txtDependents.getText());
               String state = chState.getSelectedItem();

               Tax tax=new Tax(dependents,state,grossInc);
               String sTax =
                       Double.toString(tax.calcStateTax());
               txtStateTax.setText(sTax);
             }catch(NumberFormatException e){
                 txtStateTax.setText("Non-Numeric Data");
             }catch (Exception e){
                txtStateTax.setText(e.getMessage());
             }
         }
         else if (source == bReset ){
            // The Button Reset processing
        txtGrossIncome.setText("");
        txtDependents.setText("");
                          chState.select("  ");
        txtStateTax.setText("");
             }
    }
   // This method will read the file states.txt and  
   // populate the dropdown chStates
    private void populateStates(){
      FileReader myFile = null;
      BufferedReader buff = null;
        try {
            myFile = new FileReader("states.txt");
            buff = new BufferedReader(myFile);

            boolean eof = false;
            while (!eof) {
                String line = buff.readLine();
                if (line == null)
                   eof = true;
                else
                   chState.add(line);
         }
        }catch (IOException e){
           txtStateTax.setText("Can't read states.txt");
       }
       finally{
         // Closing the streams
         try{
            buff.close();
            myFile.close();
         }catch(IOException e){
            e.printStackTrace();
         }
       }
    }

    public static void main(String args[]){
       TaxFrameFile taxFrame = new TaxFrameFile();
       taxFrame.setSize(400,150);
       taxFrame.setVisible(true);
    }
}

Data Streams

If you are expecting to work with a stream of a known data structure, i.e. two integers, three floats and a double, use either the DataInputStream or the DataOutputStream. A method call readInt() will read the whole integer number (4 bytes ) at once, and the readLong() will get you a long number (8 bytes).

The DataInput stream is just a filter. We are building a "pipe" from the following fragments:

FileInputStream --> BufferedInputStream --> DataInputStream

FileInputStream myFile = new FileInputStream("myData.dat");
BufferedInputStream buff = new BufferedInputStream(myFile);
DataInputStream data = new  DataInputStream(buff);

try {
   int num1 = data.readInt();
   int num2 = data.readInt();
   float num2 = data.readFloat();
   float num3 = data.readFloat();
   float num4 = data.readFloat();
   double num5 = data.readDouble();
} catch (EOFException eof) {...}

Class StreamTokenizer

Sometimes you need to parse a stream without knowing in advance what data types you are getting. In this case you want to get each "piece of data" (token) based on the fact that a delimiter such as a space, comma, etc separates the data elements.

The class java.io.StreamTokenizer reads tokens one at a time. It can recognize identifiers, numbers, quoted strings, etc. Typically an application creates an instance of this class, sets up the rules for parsing, and then repeatedly calls the method nextToken() until it returns the value TT_EOF (end of file).

Let's write a program that will read and parse the file customers.txt distinguishing strings from numbers.

Suppose we have a file customers.txt with the following content:

John Smith  50.24
Mary Lou  234.29
Alexander Popandopula  456.11

Here is the program that parses it:

import java.io.StreamTokenizer;
import java.io.FileReader;

public class CustomerTokenizer{
  public static void main(String args[]){

  StreamTokenizer stream =null;
  try{
    stream = new StreamTokenizer( new
                            FileReader("customers.txt"));
    while (true) {

         int token = stream.nextToken();
         if (token == StreamTokenizer.TT_EOF)
             break;
         if (token == StreamTokenizer.TT_WORD) {
             System.out.println("Got the string: " +
                                         stream.sval);
             }
         if (token == StreamTokenizer.TT_NUMBER) {
             System.out.println("Got the number: " +
                                         stream.nval);
             }
        }
      }catch (Exception e){
         System.out.println("Can't read Customers.txt: " +
                                             e.toString());
      }
      finally{
          try{
             stream.close();
          }catch(Exception e){e.printStackTrace();}          
      }
   }
}

After compiling and running the program CustomerTokenizer, the system console will look like this:

javac CustomerTokenizer.java

java CustomerTokenizer

Got the string: John
Got the string: Smith
Got the number: 50.24
Got the string: Mary
Got the string: Lou
Got the number: 234.29
Got the string: Alexander
Got the string: Popandopula
Got the number: 456.11

When a StreamTokenizer finds a word, it places the value into the sval member variable, and the numbers are placed into the variable nval.

You can specify characters that should be treated as delimiters by calling the method whitespaceChars(). The characters that represent quotes in the stream are set by calling the method quoteChar().

To make sure that certain characters are not misinterpreted, call a method ordinaryChar(), for example ordinaryChar('/');

Class StringTokenizer

The class java.util.StringTokenizer is a simpler version of a class StreamTokenizer, but it works only with strings. The set of delimiters could be specified at the creation time, i.e. comma and angle brackets:

StringTokenizer st = new StringTokenizer(
              "Yakov, 12 Main St., New York", ",<>");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}

The above code fragment would print the following:

HTML
Yakov
12 Main St.
New York

The previous sample would not return the value of a delimiter - it just returned the tokens. But sometimes, in case of multiple delimiters, you may want to know what's the current delimiter. The 3-argument constructor will provide this information. The following example defines 4 delimiters: greater and less then signs, comma and a white space:

StringTokenizer st=new StringTokenizer( "...IBM...price<...>86.3", "<>, ", true);

If the third argument is true, delimiter characters are also considered to be tokens and will be returned to the calling method, so a program may apply different logic based on the delimiter. If you decide to parse HTML file, you'll need to know what's the current delimiter to apply the proper logic.

Class File

This class has a number of useful file maintenance methods that allow rename, delete, perform existence check, etc. First you have to create an instance of this class:

File myFile = new File("abc.txt");

The line above does not actually create a file - it just creates an instance of the class File that is ready to perform some manipulations with the file abc.txt.The method createNewFile() should be used for the actual creation of the file.

Below are some useful methods of the class File:

   1. createNewFile()creates a new, empty file named according to the file name used during the File instantiation. It creates a new file only if a file with this name does not exist
   2. delete() deletes file or directory
   3. renameTo() renames a file
   4. length() returns the length of the file in bytes
   5. exists() tests whether the file with specified name exists
   6. list() returns an array of strings naming the files and directories in the specified directory
   7. lastModified() returns the time that the file was last modified
   8. mkDir() creates a directory

The code below creates a renames a file customers.txt to customers.txt.bak. If a file with such name already exists, it will be overwritten.

File file = new File("customers.txt");
File backup = new File("customers.txt.bak");
if (backup.exists()){
 backup.delete();
}
file.renameTo(backup);

More Stories By Yakov Fain

Yakov Fain is a Java Champion and a co-founder of the IT consultancy Farata Systems and the product company SuranceBay. He wrote a thousand blogs (http://yakovfain.com) and several books about software development. Yakov authored and co-authored such books as "Angular 2 Development with TypeScript", "Java 24-Hour Trainer", and "Enterprise Web Development". His Twitter tag is @yfain

Comments (13) View Comments

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


Most Recent Comments
Frederik De Milde 02/06/05 04:41:46 PM EST

A little bug in your code snippet on Buffering: buff shouldn't be declared twice...

FileInputStream myFile = null;
BufferedInputStream buff =null
try {
myFile = new FileInputStream("abc.dat");
BufferedInputStream buff = new BufferedInputStream(myFile);
boolean eof = false;
while (!eof) {
int byteValue = buff.read();
System.out.print(byteValue + " ");
if (byteValue == -1)
eof = true;
}
} catch (IOException e) {
e.printStackTrace();
} finally{
buff.close();
myFile.close();
}

Muiris O Cleirigh 03/29/04 04:26:47 AM EST

There seems to be a problem rendering this article - this part doesn''t make a lot of sense to me ......

The next code fragment writes into the file xyz.dat using the class FileOutputStream:

int somedata[]={56,230,123,43,11,37};
FileOutputStream myFile = null;
try {
myFile = new FileOutputStream("xyz.dat");
for (int i = 0; i
You can use stream chaining (or stream piping) to connect streams - think of connecting two pipes in plumbing.
Let''s modify the example that reads the file abc.dat to introduce the buffering:

FileInputStream myFile = null;
BufferedInputStream buff =null
try {
myFile = new FileInputStream("abc.dat");
BufferedInputStream buff = new BufferedInputStream(myFile);
boolean eof = false;
while (!eof) {
int byteValue = buff.read();
System.out.print(byteValue + " ");
if (byteValue == -1)
eof = true;
}
} catch (IOException e) {
e.printStackTrace();
} finally{
buff.close();
myFile.close();
}

Greg Nudelman 01/07/04 12:05:29 PM EST

Excellent article. This has got to be one of the best Stream tutorials out there. I''m also very impressed with your coverage of the StreamTokenizer -- this is an "esoteric" class that few people know and understand how to use effectively. I''m also very impressed with a nice concise run-down of the File class features, such as directory management. Again, excellent article. Keep up the good work!

Greg

michael turner 01/07/04 11:00:34 AM EST

This was a good summarization of Java Stream Basics. I have retained sections of the article for Reference. Thank you for presenting the information

Yakov Fain 01/07/04 10:48:56 AM EST

Sorry, the FileReader should have been closed:

import java.io.StreamTokenizer;
import java.io.FileReader;

public class CustomerTokenizer{
public static void main(String args[]){

StreamTokenizer stream =null;
FileReader fr=null;
try{
fr = new FileReader("customers.txt");
stream = new StreamTokenizer(fr );
while (true) {

int token = stream.nextToken();
if (token == StreamTokenizer.TT_EOF)
break;
if (token == StreamTokenizer.TT_WORD) {
System.out.println("Got the string: " +
stream.sval);
}
if (token == StreamTokenizer.TT_NUMBER) {
System.out.println("Got the number: " +
stream.nval);
}
}
}catch (Exception e){
System.out.println("Can''t read Customers.txt: " +
e.toString());
}
finally{
try{
fr.close();
}catch(Exception e){e.printStackTrace();}
}
}
}

thomas harris 01/07/04 09:50:54 AM EST

I tried to compile the CustomerTokenizer class and came up with the following error:

File: C:\java\test\CustomerTokenizer.java [line: 33]
Error: cannot resolve symbol
symbol : method close ()
location: class java.io.StreamTokenizer

I checked the api, and the close() method is available via java.io.FileReader so I am not clear on why I got the above error message. Is the compiler looking for the method in the wrong class?

Yakov Fain 01/07/04 09:19:31 AM EST

One of ther readers asked me the questions listed below, and I decided to publish the answers over here as well.

What is the reason for a buffered reader and buffered
writer--what is the need for buffering?

Buffering allows you to read a bunch of bytes from disk into the buffer in one shot, and then the program gets the data from the buffer from the memory byte by byte, int by int, etc. This minimizes the number of times your program has to access the disk.

StreamTokenizer does not have a close method. How do
you close a Stream Tokenizer?

The StreamTokenizer is not a stream itself - it does not need to be closed.

Andrey Postoyanets 01/06/04 05:22:20 PM EST

So far the style of this lessons series has been outstanding. Nothing can be more valuable than informative, easy-to-follow examples with a clear explanation.
I also agree that the usage of Java streams and XML manipulation would be a great topic for one of the future articles.

Matt Campbell 01/06/04 03:54:37 PM EST

LOL!

Steve Krenz 01/06/04 03:22:36 PM EST

What''s VB ;-)

Matt Campbell 01/06/04 02:31:39 PM EST

You know, it still isn''t nearly as easy as getting a stream of chars by line or a whole file for parsing as it is in VB. :)

Steven Krenz 01/06/04 11:27:03 AM EST

Good article as far as it goes but what would be much more helpful would be a discussion of how readers differ from and are built on streams. What I always find difficult to remember is the best way to read a whole line of data from a stream and not just characters. The connection between java.io streams and org.xml.sax.InputStream is what I suspect most developers are concerned about right now.

Warawich Sundarabhaka 01/05/04 10:50:38 PM EST

very good lesson and example.

Latest Stories
DXWorldEXPO LLC, the producer of the world's most influential technology conferences and trade shows has announced the 22nd International CloudEXPO | DXWorldEXPO "Early Bird Registration" is now open. Register for Full Conference "Gold Pass" ▸ Here (Expo Hall ▸ Here)
Excitement and interest in APIs has skyrocketed in recent years. However, if you ask a room full of IT professionals "What is an API", you will get a wide array of answers. There exists a wide knowledge gap between API experts and those that have a general idea of what they are, but are unsure of what they have been for in the past, what they look like now, and how they can be used to expand your business in the future. In this session John will cover what the history of APIs, what an API looks ...
@DevOpsSummit at Cloud Expo, taking place November 12-13 in New York City, NY, is co-located with 22nd international CloudEXPO | first international DXWorldEXPO and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time t...
DevOps with IBMz? You heard right. Maybe you're wondering what a developer can do to speed up the entire development cycle--coding, testing, source code management, and deployment-? In this session you will learn about how to integrate z application assets into a DevOps pipeline using familiar tools like Jenkins and UrbanCode Deploy, plus z/OSMF workflows, all of which can increase deployment speeds while simultaneously improving reliability. You will also learn how to provision mainframe syste...
CloudEXPO New York 2018, colocated with DXWorldEXPO New York 2018 will be held November 11-13, 2018, in New York City and will bring together Cloud Computing, FinTech and Blockchain, Digital Transformation, Big Data, Internet of Things, DevOps, AI, Machine Learning and WebRTC to one location.
The best way to leverage your Cloud Expo presence as a sponsor and exhibitor is to plan your news announcements around our events. The press covering Cloud Expo and @ThingsExpo will have access to these releases and will amplify your news announcements. More than two dozen Cloud companies either set deals at our shows or have announced their mergers and acquisitions at Cloud Expo. Product announcements during our show provide your company with the most reach through our targeted audiences.
Traditional on-premises data centers have long been the domain of modern data platforms like Apache Hadoop, meaning companies who build their business on public cloud were challenged to run Big Data processing and analytics at scale. But recent advancements in Hadoop performance, security, and most importantly cloud-native integrations, are giving organizations the ability to truly gain value from all their data. In his session at 19th Cloud Expo, David Tishgart, Director of Product Marketing ...
DevOpsSummit New York 2018, colocated with CloudEXPO | DXWorldEXPO New York 2018 will be held November 11-13, 2018, in New York City. Digital Transformation (DX) is a major focus with the introduction of DXWorldEXPO within the program. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive over the long term.
Machine Learning helps make complex systems more efficient. By applying advanced Machine Learning techniques such as Cognitive Fingerprinting, wind project operators can utilize these tools to learn from collected data, detect regular patterns, and optimize their own operations. In his session at 18th Cloud Expo, Stuart Gillen, Director of Business Development at SparkCognition, discussed how research has demonstrated the value of Machine Learning in delivering next generation analytics to impr...
Andi Mann, Chief Technology Advocate at Splunk, is an accomplished digital business executive with extensive global expertise as a strategist, technologist, innovator, marketer, and communicator. For over 30 years across five continents, he has built success with Fortune 500 corporations, vendors, governments, and as a leading research analyst and consultant.
The challenges of aggregating data from consumer-oriented devices, such as wearable technologies and smart thermostats, are fairly well-understood. However, there are a new set of challenges for IoT devices that generate megabytes or gigabytes of data per second. Certainly, the infrastructure will have to change, as those volumes of data will likely overwhelm the available bandwidth for aggregating the data into a central repository. Ochandarena discusses a whole new way to think about your next...
Extreme Computing is the ability to leverage highly performant infrastructure and software to accelerate Big Data, machine learning, HPC, and Enterprise applications. High IOPS Storage, low-latency networks, in-memory databases, GPUs and other parallel accelerators are being used to achieve faster results and help businesses make better decisions. In his session at 18th Cloud Expo, Michael O'Neill, Strategic Business Development at NVIDIA, focused on some of the unique ways extreme computing is...
"We do one of the best file systems in the world. We learned how to deal with Big Data many years ago and we implemented this knowledge into our software," explained Jakub Ratajczak, Business Development Manager at MooseFS, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
CI/CD is conceptually straightforward, yet often technically intricate to implement since it requires time and opportunities to develop intimate understanding on not only DevOps processes and operations, but likely product integrations with multiple platforms. This session intends to bridge the gap by offering an intense learning experience while witnessing the processes and operations to build from zero to a simple, yet functional CI/CD pipeline integrated with Jenkins, Github, Docker and Azure...
Today, we have more data to manage than ever. We also have better algorithms that help us access our data faster. Cloud is the driving force behind many of the data warehouse advancements we have enjoyed in recent years. But what are the best practices for storing data in the cloud for machine learning and data science applications?