Java IO - Reading Content from URL

June 11, 2020

1. Introduction

In this article, we present several ways to read content directly from a URL in Java. We will use classes available in plain Java, such as BufferedReader, Scanner, and InputStream, as well as external libraries such as Guava and Apache Commons IO.

This article is part of the Java I/O Series.

2. Reading Directly from a URL using BufferedReader

Let's start with a simple solution in plain Java. In this example we use InputStreamReader, which is a bridge from byte streams to character streams, to convert the InputStream obtained from the URL into a character-based stream. For better performance, we wrap the InputStreamReader in a BufferedReader, which buffers input for efficient reading of characters, arrays, and lines.

package com.frontbackend.java.io.url;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ReadURLUsingBufferedReader {

    public static void main(String[] args) throws IOException {
        String line;
        StringBuffer buff = new StringBuffer();

        URL url = new URL("http://www.example.com/");

        // Decode explicitly as UTF-8 instead of relying on the platform default charset
        try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            while ((line = in.readLine()) != null) {
                buff.append(line)
                    .append(System.lineSeparator());
            }
        }

        System.out.println(buff.toString());
    }
}

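In this example, we read the content of the URL line by line and append each line to a StringBuffer, followed by the platform-dependent line separator returned by System.lineSeparator(). Because readLine() strips the original line terminators, the result may use different line endings than the source.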
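The same line-by-line reading can also be written with BufferedReader.lines(), available since Java 8. The following is only a sketch: the class name UrlLines and the method name read are hypothetical, and UTF-8 is an assumed encoding that may not match what a given server sends.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical helper class; the name is not part of the original examples.
class UrlLines {

    // Reads the whole resource as text and joins the lines with the
    // platform-dependent separator. UTF-8 is an assumption.
    static String read(URL url) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            // lines() reports read failures as UncheckedIOException
            return in.lines().collect(Collectors.joining(System.lineSeparator()));
        }
    }
}
```

Unlike the loop above, Collectors.joining places the separator only between lines, so the result has no trailing line separator.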

3. Reading content from URL with Scanner

In the next example, we use the Scanner class, which splits its input into tokens using a delimiter pattern (a regular expression) and can parse primitive types and strings.

package com.frontbackend.java.io.url;

import java.io.IOException;
import java.net.URL;
import java.util.Scanner;

public class ReadURLUsingScanner {

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.example.com/");

        String content;
        try (Scanner scanner = new Scanner(url.openStream(), "UTF-8")) {
            content = scanner.useDelimiter("\\A")
                             .next();
        }
        System.out.println(content);
    }
}

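In this example, we set the Scanner delimiter to \A (written as "\\A" in Java source), a regular expression that matches only the beginning of the input. Because that delimiter never occurs again, the whole stream forms a single token, and next() returns all characters from the beginning to the end of the stream.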
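One edge case is worth a sketch: when the stream is empty there is no token at all, and next() throws NoSuchElementException. A hedged variant guards the call with hasNext(). The class name UrlScanner and the method name read are hypothetical, and UTF-8 is again an assumption.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Scanner;

// Hypothetical helper class; the name is not part of the original examples.
class UrlScanner {

    // Returns the whole content of the URL, or an empty string if there is none.
    static String read(URL url) throws IOException {
        try (Scanner scanner = new Scanner(url.openStream(), "UTF-8")) {
            scanner.useDelimiter("\\A");
            // hasNext() is false for an empty stream, where next() would throw
            return scanner.hasNext() ? scanner.next() : "";
        }
    }
}
```

Keep in mind that Scanner does not throw an IOException raised while reading; it treats the input as ended, and the exception can be inspected afterwards with scanner.ioException().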

4. Get URL content using Java 9 InputStream

Java 9 added the InputStream.readAllBytes() method, which reads all remaining bytes from an input stream. We can use it as in the following example:

package com.frontbackend.java.io.url;

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ReadURLUsingInputStream {

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com/");

        try (InputStream inputStream = url.openStream()) {
            byte[] bytes = inputStream.readAllBytes();
            System.out.println(new String(bytes, StandardCharsets.UTF_8));
        }
    }
}

Note that a charset should always be specified explicitly when converting bytes to characters; otherwise the result depends on the JVM's default charset.
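On Java 8 and earlier, where readAllBytes() is not available, the same result can be obtained by copying the stream into a ByteArrayOutputStream. This is a minimal sketch: the class name UrlBytes, the method names, and the 8192-byte buffer size are illustrative choices, and UTF-8 is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is not part of the original examples.
class UrlBytes {

    // Copies the stream into memory chunk by chunk until end of stream.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192]; // arbitrary buffer size
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Reads the URL content and decodes it as UTF-8 (an assumption).
    static String read(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(readAll(in), StandardCharsets.UTF_8);
        }
    }
}
```

As with readAllBytes(), the whole response is held in memory, so this approach is best suited to reasonably small resources.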

5. Read URL using Guava

The Guava library provides the Resources.toString(...) method, which reads the entire content of a URL into a String.

package com.frontbackend.java.io.url;

import java.io.IOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import com.google.common.io.Resources;

public class ReadURLUsingGuava {

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com/");

        String str = Resources.toString(url, StandardCharsets.UTF_8);
        System.out.println(str);
    }
}

6. Reading URL content using Apache Commons IO library

The Apache Commons IO library comes with the IOUtils class, which can be used to convert the InputStream of a URL into a String.

package com.frontbackend.java.io.url;

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import org.apache.commons.io.IOUtils;

public class ReadURLUsingApacheCommonsIO {

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com/");

        try (InputStream in = url.openStream()) {
            String str = IOUtils.toString(in, StandardCharsets.UTF_8);
            System.out.println(str);
        }
    }
}

7. Conclusion

In this article, we presented several ways to read content from a URL in Java. We used classes available in plain Java as well as libraries such as Guava and Apache Commons IO. Conveniently, the URL class provides the openStream() method, which returns an InputStream, so reading a URL really comes down to converting an InputStream to a String.
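Since openStream() is shorthand for openConnection().getInputStream(), the same conversion also works on a URLConnection, which additionally allows timeouts to be set (the default of 0 means waiting indefinitely). The sketch below assumes Java 9 or later and UTF-8 content; the class name UrlWithTimeouts, the method name read, and its parameters are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is not part of the original examples.
class UrlWithTimeouts {

    // Reads the URL content with explicit connect and read timeouts in milliseconds.
    static String read(URL url, int connectTimeoutMillis, int readTimeoutMillis) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(connectTimeoutMillis);
        connection.setReadTimeout(readTimeoutMillis);
        try (InputStream in = connection.getInputStream()) {
            // readAllBytes() requires Java 9+; UTF-8 is an assumption
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

A call such as UrlWithTimeouts.read(new URL("http://www.example.com/"), 5000, 5000) would then fail with a SocketTimeoutException instead of blocking when the server does not respond within five seconds.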

The examples used in this tutorial are available in our GitHub repository.
