Adam Caudill

Security Leader, Researcher, Developer, Writer, & Photographer

PageSource

This post was imported from an old blog archive, and predates the creation of AdamCaudill.com.

Here is a quick little function that retrieves the source code of a web page. It uses only the WinINet API, so the Internet Transfer control is not required.

API:

Private Declare Function InternetOpen Lib "wininet.dll" Alias "InternetOpenA" ( _
    ByVal sAgent As String, _
    ByVal lAccessType As Long, _
    ByVal sProxyName As String, _
    ByVal sProxyBypass As String, _
    ByVal lFlags As Long) As Long
Private Declare Function InternetOpenUrl Lib "wininet.dll" Alias "InternetOpenUrlA" ( _
    ByVal hInternetSession As Long, _
    ByVal sURL As String, _
    ByVal sHeaders As String, _
    ByVal lHeadersLength As Long, _
    ByVal lFlags As Long, _
    ByVal lContext As Long) As Long
'InternetReadFile and InternetCloseHandle return a Win32 BOOL, which is 32 bits wide (a VB Long)
Private Declare Function InternetReadFile Lib "wininet.dll" ( _
    ByVal hFile As Long, _
    ByVal sBuffer As String, _
    ByVal lNumBytesToRead As Long, _
    lNumberOfBytesRead As Long) As Long
Private Declare Function InternetCloseHandle Lib "wininet.dll" ( _
    ByVal hInet As Long) As Long

Constants:

Private Const IF_NO_CACHE_WRITE = &H4000000
Private Const BUFFER_LEN = 256

Function:

Public Function PageSource(ByVal sURL As String, Optional ByVal strHeaders As String = "") As String
Dim sBuffer As String * BUFFER_LEN, iResult As Long, sData As String
Dim hInternet As Long, hSession As Long, lReturn As Long
Dim lngHeaderLen As Long

    lngHeaderLen = Len(strHeaders)
    'open a WinINet session (1 = INTERNET_OPEN_TYPE_DIRECT, i.e. no proxy);
    'pass the agent name only, WinINet adds the "User-Agent:" prefix itself
    hSession = InternetOpen("Your-User-Agent-Here", 1, vbNullString, vbNullString, 0)
    'get the handle of the url
    If hSession Then hInternet = InternetOpenUrl(hSession, sURL, strHeaders, lngHeaderLen, IF_NO_CACHE_WRITE, 0)
    'if we have the handle, then start reading the web page
    If hInternet Then
        'read one chunk at a time until the call fails or no more data is returned
        Do
            lReturn = 0
            iResult = InternetReadFile(hInternet, sBuffer, BUFFER_LEN, lReturn)
            If iResult = 0 Or lReturn = 0 Then Exit Do
            'keep only the bytes actually read, not the padding of the fixed-length buffer
            sData = sData & Left$(sBuffer, lReturn)
            DoEvents
        Loop
        'close the URL
        iResult = InternetCloseHandle(hInternet)
    End If
    'close the session as well, so its handle is not leaked
    If hSession Then iResult = InternetCloseHandle(hSession)
    PageSource = sData
End Function

Simply pass the URL you want, plus any extra headers that should be sent (one header per line, each terminated by a vbCrLf), and it returns the full source of the page (without the response headers).
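
As a rough illustration of the call described above, here is a minimal usage sketch. It assumes a form with a command button named Command1 and uses http://www.example.com/ as a placeholder URL; the two header values are only examples, and none of these names come from the original post.

```vb
'Hypothetical caller: Command1 and the example.com URL are placeholders
Private Sub Command1_Click()
    Dim sHeaders As String
    Dim sHtml As String

    'one header per line, each terminated by vbCrLf (example values only)
    sHeaders = "Accept-Language: en-US" & vbCrLf & _
               "Referer: http://www.example.com/" & vbCrLf

    sHtml = PageSource("http://www.example.com/", sHeaders)

    'PageSource returns an empty string when nothing could be read
    If Len(sHtml) = 0 Then
        MsgBox "No data was returned."
    Else
        'show the start of the page in the Immediate window
        Debug.Print Left$(sHtml, 200)
    End If
End Sub
```

Since strHeaders is optional, calling PageSource with only the URL sends no extra headers.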
