unzip function for external ftp

Comments

4 comments

  • Comment actions Permalink
    Official comment
    Avatar
    Martin Lutzemann

    Hi Dirk,

    the procedure has no implementation for data sources other than the local file system. I took the opportunity and we will implement a very generic version of uncompressing files from any file-based data source in the near future.

    In the meantime, you can use an auxiliary procedure you find below. It works similar to UTILS.unzip() but supports also S3, SFTP and so forth. You just call it with 'exec views.unzip(..)' like the UTILS procedure. Please note that the last parameter references a folder on the file system local to DataVirtuality Server.

     

    CREATE VIRTUAL PROCEDURE views.unzip(IN file_datasource string, IN file_name string , IN output_folder string)
    AS
    BEGIN
    EXECUTE IMMEDIATE '
    select ''done'' from ( call "' || file_datasource || '.getFiles"(
    "pathAndPattern" => '''|| file_name || '''
    ))x, OBJECTTABLE(language ''javascript''
    ''importPackage(java.io);
    var zip = new java.util.zip.ZipInputStream(data.getBinaryStream());
    var ze = null;
    var outputFolder = new File( output_folder );
    if(!outputFolder.exists()){
    outputFolder.mkdirs();
    }
    var outputFolderName = output_folder;
    while ( (ze = zip.getNextEntry()) != null ) {
    var entryFileName = ze.getName();
    var newSubFile = new File(outputFolderName + java.lang.System.getProperty("file.separator")+ entryFileName);
    if(ze.isDirectory()) {
    var newSubDir = new File(newSubFile.getAbsolutePath());
    if(!newSubDir.exists()) {
    newSubDir.mkdirs();
    }
    } else {
    var fos = new java.io.FileOutputStream(newSubFile);
    var buffer = new java.lang.reflect.Array.newInstance(java.lang.Byte.TYPE, 65536);
    var len;
    while ((len = zip.read(buffer)) > 0) {
    fos.write(buffer, 0, len);
    }
    fos.close();
    }
    }
    zip.closeEntry();
    zip.close();
    ''
    PASSING x.file AS data, output_folder AS output_folder
    COLUMNS "result" blob ''dv_row''
    )AS y;';
    END;;

     

    I hope this is good solution for you.

    Best,
    Martin

  • 0
    Comment actions Permalink
    Avatar
    Martin Lutzemann

    Hey Dirk,

     

    I am already in the process of checking the procedure and other methods available in order to reach you goal.
    Will keep you updated.

     

    Best,
    Martin

  • 0
    Comment actions Permalink
    Avatar
    Jonas Seemann

    Hi Martin,

    since this topic became relevant to us we'd like know if this is possible by now?

    We have gzip-files on a sftp and want to unzip these files on the sftp as well.

    Thanks and BR,

    Jonas

  • 0
    Comment actions Permalink
    Avatar
    Martin Lutzemann

    Hi Jonas,

     

    there is a functionality which allows on-the-fly extracting of 'simple' archived.

    By simple I mean that the archive must not contain more than one file inside and only Zip and GZip archives are supported.

    In order to make use of this capability, you need to add an advanced property to the data source (file, SCP and FTP understand this setting).

    Edit the data source -> click on Advanced -> click on show advanced connection parameters -> enter decompressCompressedFiles=true as additional parameter

     

    You can now read the contents of the compressed file via our SQL dialect.
    For example, the script below reads Example.json.gz, decompresses it and displays its content after converting the BLOB to a UTF-8 encoded CLOB:

    select to_chars(f.file, 'utf-8')
    from (exec "ds_file.getFiles"(
            "pathAndPattern" => 'Example.json.gz'
             )) as f;;

     

    If you want to save the extracted file then you have to combine this call with the saveFile() procedure of your data source.
    Like so:

    exec "ds_file.saveFile"(
       "filePath" => 'extractedFile.json',
       "file" => (select to_chars(f.file, 'utf-8)
                    from (exec "ds_file.getFiles"(
                   "pathAndPattern" => 'Example.json.gz'
                   )) as f)
    );;

     

    Please not that you cannot get the name of the file inside and you have to come up with the name of the file to be stored by yourself.

     

    I hope this helps.

     

    Best,

Please sign in to leave a comment.

Powered by Zendesk