Scraping

  • Download a web page and all its requisites:

    $ wget -r -p -nc -nH --level=1 https://pypi.python.org/simple/python-ldap/
    
  • Check that the local SOCKS proxy started by Tor Browser is working:

    $ curl --preproxy 127.0.0.1:9150 "https://check.torproject.org"
    
  • Reuse the local Tor Browser proxy to download a video:

    $ yt-dlp --proxy socks5://127.0.0.1:9150 "https://www.video-provider.com/watch/random_id"
    
  • Create a PNG image of a rendered HTML page:

    $ kwebdesktop 1024 768 capture.png https://slashdot.org/
    

Servers

  • Test that your site is sending gzipped content:

    $ curl -i -H "Accept-Encoding: gzip,deflate" https://kevin.deldycke.com 2>&1 | grep gzip
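
    Grepping for the bare word gzip matches any line that contains it, including the response body. A stricter check, sketched below, only looks at the Content-Encoding header. The helper name has_gzip_encoding and the example.com URL are assumptions made for illustration, and the pattern only covers the simple case of a single gzip encoding.

```shell
# Hypothetical helper: succeed only if the HTTP headers read from stdin
# contain a "Content-Encoding: gzip" line (case-insensitive).
has_gzip_encoding() {
    # HTTP headers end with CRLF, so strip carriage returns before matching.
    tr -d '\r' | grep -qi '^content-encoding:[[:space:]]*gzip[[:space:]]*$'
}

# Usage against a live site (placeholder URL); -D - dumps the response
# headers to stdout while -o /dev/null discards the body:
#   curl -s -o /dev/null -D - -H "Accept-Encoding: gzip,deflate" https://example.com \
#       | has_gzip_encoding && echo "gzipped"
```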
    
  • Fetch some pages on the internet to force our corporate proxy to refresh its internal cache:

    $ for EGG in BeautifulSoup PIL Plone; do wget --server-response -O /dev/null https://pypi.python.org/simple/$EGG/; done
    
  • Debug mysterious numbers by decoding them into the ASCII bytes they encode (source):

    $ echo 'obase=16; 1195725856' | bc | xxd -r -ps | od -cb
    0000000   G   E   T
            107 105 124 040
    0000004
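
    The same decoding can be sketched with shell arithmetic alone, without bc or xxd. The function name decode_number is an assumption, and the sketch only handles 32-bit values whose bytes are printable: NUL bytes and trailing newlines are lost in the command substitution.

```shell
# Hypothetical helper: print the four big-endian bytes of a 32-bit
# decimal number as ASCII characters.
decode_number() {
    n=$1
    out=''
    for bits in 24 16 8 0; do
        # Isolate one byte, most significant first.
        byte=$(( (n >> bits) & 255 ))
        # Turn the byte into an octal escape such as \107 and let printf render it.
        out="$out$(printf "\\$(printf '%03o' "$byte")")"
    done
    printf '%s\n' "$out"
}

decode_number 1195725856   # prints "GET " (with a trailing space)
```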
    

Certificates

  • Create a minimal self-signed SSL certificate with an unencrypted key, no issuer information, and a validity period of 10 years:

    $ openssl req -x509 -nodes -subj '/' -days 3650 -newkey rsa:2048 -keyout self-signed.pem -out self-signed.pem
    
  • Create a self-signed SSL certificate and its (unencrypted) private key as two separate files (source):

    $ openssl genrsa -out private.key 2048
    $ openssl req -new -subj '/' -key private.key -out certreq.csr
    $ openssl x509 -req -days 3650 -in certreq.csr -signkey private.key -out self-signed.pem
    $ rm certreq.csr
    
  • View certificate details:

    $ openssl x509 -noout -text -in self-signed.pem
    
  • Fetch the first certificate of the chain from a server:

    $ openssl s_client -connect imap.gmail.com:993 -showcerts 2>&1 < /dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | sed -ne '1,/-END CERTIFICATE-/p' > ~/gmail.pem
    
  • Fetch the last certificate of the chain from a server:

    $ openssl s_client -connect imap.gmail.com:993 -showcerts 2>&1 < /dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | tac | sed -ne '1,/-BEGIN CERTIFICATE-/p' | tac > ./google.pem
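
    How the two sed pipelines above pick the first or the last PEM block can be tried offline on fake input. In this sketch the function names and the sample text are assumptions, and the FIRST and LAST bodies are placeholders, not real certificate data.

```shell
# Fake s_client-like output: two placeholder PEM blocks surrounded by noise.
sample_chain() {
    printf '%s\n' \
        'depth=1 some noise' \
        '-----BEGIN CERTIFICATE-----' \
        'FIRST' \
        '-----END CERTIFICATE-----' \
        'more noise' \
        '-----BEGIN CERTIFICATE-----' \
        'LAST' \
        '-----END CERTIFICATE-----' \
        'trailing noise'
}

# Keep only the PEM blocks, dropping every line outside of them.
pem_blocks() {
    sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
}

# First block: print from line 1 up to the first END marker.
first_cert() {
    pem_blocks | sed -ne '1,/-END CERTIFICATE-/p'
}

# Last block: reverse the lines, keep up to the first BEGIN marker, reverse back.
last_cert() {
    pem_blocks | tac | sed -ne '1,/-BEGIN CERTIFICATE-/p' | tac
}

sample_chain | first_cert
sample_chain | last_cert
```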
    

MIME type

Markup

  • Search for non-breaking space entities that don’t end with a semicolon:

    $ grep -RIi --extended-regexp '&nbsp[^;]' ./
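
    The pattern '&nbsp[^;]' needs one more character after the entity, so it misses an unterminated &nbsp sitting at the very end of a line. A variant that also covers that case might look like the sketch below. The function name find_broken_nbsp and the demo file are assumptions made for illustration.

```shell
# Hypothetical helper: list lines holding a "&nbsp" entity that lacks its
# semicolon, including when the entity is the last thing on the line.
find_broken_nbsp() {
    grep -RIin --extended-regexp '&nbsp([^;]|$)' "$@"
}

# Demo on a throwaway directory (file name and contents are made up).
demo_dir=$(mktemp -d)
printf '%s\n' 'ok&nbsp;here' 'broken&nbsphere' 'trailing&nbsp' > "$demo_dir/page.html"
find_broken_nbsp "$demo_dir"   # reports lines 2 and 3 only
rm -r "$demo_dir"
```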