How do I block ads on YouTube?

blacklisting

#134

Update
With the hackertarget API there is no longer any need for dnsdumpster, so from now on the temp files are saved in the Pi-hole directory.

You can uninstall dnsdumpster, python, and python-pip, or leave them as they are.

Important part:
Update the /etc/pihole/youtube-ads.sh file to the following:


#!/bin/sh
rm -f /etc/pihole/youtube-filtered.txt
rm -f /etc/pihole/youtube-ads.txt
curl -s "https://api.hackertarget.com/hostsearch/?q=googlevideo.com" | awk -F, 'NR>1 {print $1}' | sudo tee /etc/pihole/youtube-filtered.txt > /dev/null
sed 's/\s.*$//' /etc/pihole/youtube-filtered.txt >> /etc/pihole/youtube-ads.txt
cat /etc/pihole/youtube-ads.txt > /var/www/html/youtube-ads-list.txt
# grep the log for YouTube ad hosts and append them to /var/www/html/youtube-ads-list.txt
grep 'r*.googlevideo.com' /var/log/pihole.log | awk '{print $6}' | grep -vE '^googlevideo.com|redirector' | sort -nr | uniq >> /var/www/html/youtube-ads-list.txt
# remove duplicate lines from /var/www/html/youtube-ads-list.txt
perl -i -ne 'print if ! $x{$_}++' /var/www/html/youtube-ads-list.txt
# update the Pi-hole blacklist/whitelist
pihole -g

Flush your logs from within the Pi-hole admin interface.
Wait, or manually run /etc/pihole/youtube-ads.sh (2x).
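
For anyone wondering what the awk step in the script does, here is a minimal sketch. It assumes the API returns one `hostname,ip` pair per line; the hostnames and addresses in the sample are made up for illustration and are not real API output.

```shell
#!/bin/sh
# Hypothetical sample of "hostname,ip" lines; real API output will differ.
sample='googlevideo.com,203.0.113.1
r1---sn-aaaaaaaa.googlevideo.com,203.0.113.2
r2---sn-bbbbbbbb.googlevideo.com,203.0.113.3'

# -F, splits each line on commas, NR>1 skips the first line
# (as the script above does), and $1 is the hostname column.
extract_hosts() {
    awk -F, 'NR>1 {print $1}'
}

hosts=$(printf '%s\n' "$sample" | extract_hosts)
printf '%s\n' "$hosts"
```

Only the hostname column survives, which is the form a block list needs.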

thanks ppl good luck


#136

The page https://api.hackertarget.com/hostsearch/?q=googlevideo.com did list some of the googlevideo.com hostnames, but it didn't have the rxxxsnxxxx.googlevideo.com ones that my phone received, so all the ads just slipped through. Can you help me with this?


#137

I had to change the column awk selects in order to filter the "r5---sn-5hnednlr.googlevideo.com" look-alike lines from the Pi-hole log. Column 6 seems to feed my devices' IP addresses into youtube-ads-list.txt instead.

From:
grep 'r*.googlevideo.com' /var/log/pihole.log | awk '{print $6}' | grep -vE '^googlevideo.com|redirector' | sort -nr | uniq >> /var/www/html/youtube-ads-list.txt
To:
grep 'r*.googlevideo.com' /var/log/pihole.log | awk '{print $8}' | grep -vE '^googlevideo.com|redirector' | sort -nr | uniq >> /var/www/html/youtube-ads-list.txt

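Since the right awk column apparently depends on how the log is laid out, one option is to pull the hostname out by pattern instead of by position. This is only a sketch: the two log layouts below are made up, and `grep -o` is a widely supported extension rather than strict POSIX.

```shell
#!/bin/sh
# Two made-up log layouts: the hostname is field 6 in one and field 8 in the other.
log='Jan  1 12:00:00 dnsmasq[123]: query[A] r5---sn-5hnednlr.googlevideo.com from 192.168.1.10
Jan  1 12:00:01 pihole dnsmasq[123]: 42 query[A] r3---sn-nx57yn76.googlevideo.com from 192.168.1.10'

# -o prints only the matching part of each line, so the column no longer matters.
extract_hosts() {
    grep -oE 'r[0-9]+---sn-[a-z0-9-]+\.googlevideo\.com' | sort -u
}

hosts=$(printf '%s\n' "$log" | extract_hosts)
printf '%s\n' "$hosts"
```

Because the match is on the hostname itself, client IP addresses cannot end up in the list.
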

#138

Based on the info in this thread I've put together a simple script. I don't use the YouTube app that often, so I'm not 100% sure this works perfectly. It also appends the URLs with the r00---sn-xxxxxx.googlevideo.com pattern, which are not in the hackertarget list.

Place it somewhere convenient, add execute permissions (chmod +x filter-youtube-domains.sh) and add it to your cron jobs.

Edit:
I have noticed some ads still (although fewer than I used to). I'm not sure yet whether the domains provided by hackertarget are incomplete or whether I need to run my cron job more often (it currently runs every 24h). I did find a longer list on Wolfram Alpha (click on subdomains), but they don't seem to have an easy way to get those in plain text.


#139

See my script below; it fixes those URLs too :)


#140
curl 'http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=$APPID$&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More' | grep -Po "r\d+---sn-.+.googlevideo.com" >> $FILE

To get more subdomains from Wolfram Alpha, add this after line 16.
Replace $APPID$ with your AppID from Wolfram Alpha: https://developer.wolframalpha.com/portal/signin.html


#141

I think you missed a \ after the curl command. I also used a different quote character; I don't know if that matters, though.

curl 'http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=$APPID$&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More' \

| grep -Eo “r\d±–sn-.+.googlevideo.com” >> $FILE

#142

this does not send anything to the file.

(i do have a Wolfram ID for the app. The curl command does pull information)


#143
| grep -Eo "r\d+---sn-.+.googlevideo.com" >> $FILE

copy pasting removed the 3 ‘-’ after the ‘+’
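
If you suspect a pasted command got mangled like this (curly quotes, ± or – in place of plain ASCII), a quick check along these lines may help. It is a sketch under the assumption that the script should contain only ASCII; the temp file stands in for your real script.

```shell
#!/bin/sh
# Count lines containing bytes outside printable ASCII and whitespace.
count_non_ascii_lines() {
    LC_ALL=C grep -c '[^[:print:][:space:]]' "$1"
}

tmp=$(mktemp)
# Line 1 is plain ASCII; line 2 uses curly quotes (written as octal UTF-8 bytes).
printf '%s\n' 'grep -Po "r\d+---sn-"' > "$tmp"
printf 'grep -Po \342\200\234r\\d\342\200\235\n' >> "$tmp"

mangled=$(count_non_ascii_lines "$tmp")
echo "lines with non-ASCII characters: $mangled"
rm -f "$tmp"
```

Swapping `-c` for `-n` lists the offending lines with their line numbers instead of counting them.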


#144

sudo curl “http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=MYAPPID&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More” | sudo grep -Eo “r\d±–sn-.+.googlevideo.com” >> /tmp/bla.txt

bla.txt is blank

(I do have the corrected grep command - seems to auto truncate when pasting here)


#145

Replace grep -Eo with grep -Po. The -E flag doesn't work on Raspbian for me, but -P does.
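
The likely reason is that `\d` is a Perl-style shorthand and not part of POSIX extended regular expressions, so `-E` does not treat it as "any digit". A sketch of both options follows: `[0-9]` works with `-E`, while `\d` needs `-P`, which is a GNU grep extension and may be missing on some systems. The hostname is just an example taken from this thread.

```shell
#!/bin/sh
host='r4---sn-nx5e6n76.googlevideo.com'

# Portable form: [0-9] is understood by -E; the dots are escaped.
match_ere() {
    grep -Eo 'r[0-9]+---sn-.+\.googlevideo\.com'
}

# Perl-style form: \d requires -P (GNU grep built with PCRE support).
match_pcre() {
    grep -Po 'r\d+---sn-.+\.googlevideo\.com'
}

ere_result=$(printf '%s\n' "$host" | match_ere)
echo "ERE:  $ere_result"
pcre_result=$(printf '%s\n' "$host" | match_pcre 2>/dev/null)
echo "PCRE: ${pcre_result:-(grep -P not available)}"
```

If portability matters, the `[0-9]` form avoids the dependency on `-P` altogether.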


#146

That did the trick. Thank you.


#147

nice, thanks


#148

Hi Guys,

I have Pi-hole on an Ubuntu server. I am trying this script. I added the lines:

sudo curl 'http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=R#######8&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More'

| sudo grep -Eo “r\d±–sn-.+.googlevideo.com” >> /etc/pihole/youtube

I get this error on the command line:

</queryresult>
youtube.sh: line 21: syntax error near unexpected token `|'
youtube.sh: line 21: `| sudo grep -Eo “r\d±–sn-.+.googlevideo.com” >> etc/pihole/youtube
hernando@pihole:~$  syntax error near unexpected token `|'

There is a file at /etc/pihole/youtube.hosts

It contains lines like this:

r5.snoguesnzz.googlevideo.com
r4---sn-nx5e6n76.googlevideo.com
r3---sn-nx57yn76.googlevideo.com
r9---sn-n8v7zn76.googlevideo.com

Can someone please help me.

If this means it’s working, how can I check in the Pi-hole web GUI that these are being blocked?

Thank you. I love my Pi-Hole.


#149

Hey, I had the same problem but following Sergeant_Salz’s advice, adding a \ to the end of the first line fixed it for me.

curl 'http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=$APPID$&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More' \

Not sure how to test if the script worked however.
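
To illustrate the line-continuation point: a line that begins with `|` is a syntax error unless the previous line ends with a backslash, and a blank line between the backslash and the pipe breaks it again. A small sketch with made-up input, showing two forms that work:

```shell
#!/bin/sh
# Form 1: a backslash as the very last character joins the next line.
first=$(printf 'a.googlevideo.com\nexample.org\n' \
    | grep 'googlevideo')

# Form 2: end the line with the pipe itself; no backslash is needed.
second=$(printf 'a.googlevideo.com\nexample.org\n' |
    grep 'googlevideo')

echo "$first"
echo "$second"
```

The second form is a bit more robust against copy-paste, since there is no trailing backslash to lose.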


#150

Thanks a lot to @ErikFontanel for the script!

However, I had to make some modifications, some of them suggested in this thread, to make it work, at least partially.

This is what I did:

#!/bin/sh
# This script will fetch the Googlevideo ad domains and append them to the Pi-hole block list.
# Run this script daily with a cron job (don't forget to chmod +x)
# More info here: https://discourse.pi-hole.net/t/how-do-i-block-ads-on-youtube/253/136

# File to store the YT ad domains
FILE=/etc/pihole/youtube.hosts

# Wolfram Alpha AppID
APPID=Your-AppID

# Fetch the list of domains, remove the ip's and save them
curl 'https://api.hackertarget.com/hostsearch/?q=googlevideo.com' \
| awk -F, 'NR>1{print $1}' \
| grep -vE "redirector|manifest" > $FILE

# Replace r*.sn*.googlevideo.com URLs to r*---sn-*.googlevideo.com
# and add those to the list too
curl "http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=${APPID}&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More" \
| grep -Po "r\d+---sn-.+.googlevideo.com" >> $FILE

# Scan log file for previously accessed domains
grep "^r*.googlevideo\.com" /var/log/pihole*.log \
| awk '{print $8}' \
| grep -vE "redirector|manifest" \
| sort | uniq >> $FILE

# Add to Pi-hole adlists if it's not there already
if ! grep $FILE < /etc/pihole/adlists.list; then echo "file://$FILE" >> /etc/pihole/adlists.list; fi; 
  • I had to wrap the Wolfram Alpha URL in double quotes (") because single quotes (') broke the URL and made it dysfunctional. It seemed to have something to do with the & characters in the middle of the URL.
  • As others suggested, I changed grep -Eo "r\d+---sn-.+.googlevideo.com" to grep -Po "r\d+---sn-.+.googlevideo.com"
  • I added the variable APPID so it's easier to set the Wolfram Alpha AppID.
  • Up to that point everything is OK, and $FILE seems to get more populated at each step. However, I can't make the last step work: if I run it on the CLI as below, test.txt is empty.
grep "^r*.googlevideo\.com" /var/log/pihole*.log \
| awk '{print $8}' \
| grep -vE "redirector|manifest" \
| sort | uniq >> test.txt
  • So I guess the last part is not collecting URLs from the log. Besides, none of the unblocked domains I see in my log show up in the final file.
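
A possible explanation for the quoting issue in the first bullet (my assumption, not something confirmed in this thread): inside single quotes the shell does not expand `${APPID}`, so the literal text would be sent to the API, whereas double quotes expand the variable while still protecting the `&` characters. A sketch with a placeholder AppID:

```shell
#!/bin/sh
APPID='demo-id'   # placeholder value for illustration only

# Single quotes: ${APPID} stays literal.
single='appid=${APPID}&format=plaintext'

# Double quotes: ${APPID} expands, and & is still protected from the shell.
double="appid=${APPID}&format=plaintext"

echo "$single"
echo "$double"
```

Echoing the URL like this before passing it to curl is a cheap way to see what the API actually receives.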

On top of all of that, I still see YouTube ads, and I've noticed that there are other kinds of .googlevideo.com subdomains with an ixhs suffix. I don't know if this means something.

r9---sn-uxap5nvoxg5-ixhs.googlevideo.com
r5---sn-uxap5nvoxg5-j2is.googlevideo.com
r2---sn-uxap5nvoxg5-ixhs.googlevideo.com
r4---sn-aigzrn7z.googlevideo.com
r12---sn-uxap5nvoxg5-ixhs.googlevideo.com
r16---sn-uxap5nvoxg5-ixhs.googlevideo.com
r4---sn-5go7yn7l.googlevideo.com
r1---sn-5goeen76.googlevideo.com
r8---sn-uxap5nvoxg5-ixhs.googlevideo.com
r1---sn-a5mlrnel.googlevideo.com
r6---sn-5hne6nlr.googlevideo.com
r6---sn-uxap5nvoxg5-ixhs.googlevideo.com
r7---sn-uxap5nvoxg5-ixhs.googlevideo.com
r5---sn-5ualdn7z.googlevideo.com
r9---sn-uxap5nvoxg5-ixhs.googlevideo.com
r10---sn-uxap5nvoxg5-ixhs.googlevideo.com
r3---sn-uxap5nvoxg5-ixhs.googlevideo.com
r5---sn-i5heen7l.googlevideo.com
r14---sn-uxap5nvoxg5-ixhs.googlevideo.com
r10---sn-uxap5nvoxg5-ixhs.googlevideo.com
r16---sn-uxap5nvoxg5-ixhs.googlevideo.com
r4---sn-uxap5nvoxg5-ixhs.googlevideo.com
r9---sn-uxap5nvoxg5-ixhs.googlevideo.com
r2---sn-uxap5nvoxg5-ixhs.googlevideo.com
r3---sn-ab5sznlk.googlevideo.com
r6---sn-uxap5nvoxg5-ixhs.googlevideo.com
r7---sn-uxap5nvoxg5-ixhs.googlevideo.com
r11---sn-uxap5nvoxg5-ixhs.googlevideo.com
r4---sn-uxap5nvoxg5-ixhs.googlevideo.com
r10---sn-uxap5nvoxg5-ixhs.googlevideo.com
r11---sn-uxap5nvoxg5-j2is.googlevideo.com
r9---sn-uxap5nvoxg5-ixhs.googlevideo.com
r3---sn-5hne6n7s.googlevideo.com
r5---sn-uxap5nvoxg5-ixhs.googlevideo.com
r5---sn-5go7yne6.googlevideo.com
r12---sn-uxap5nvoxg5-ixhs.googlevideo.com
r10---sn-uxap5nvoxg5-ixhs.googlevideo.com
r5---sn-ab5szn76.googlevideo.com
r6---sn-vgqs7nlz.googlevideo.com
r11---sn-uxap5nvoxg5-ixhs.googlevideo.com
r1---sn-hp57ynel.googlevideo.com
r6---sn-uxap5nvoxg5-ixhs.googlevideo.com
r12---sn-uxap5nvoxg5-ixhs.googlevideo.com
r1---sn-vgqskn7l.googlevideo.com
r7---sn-uxap5nvoxg5-ixhs.googlevideo.com
r4---sn-uxap5nvoxg5-ixhs.googlevideo.com
r8---sn-uxap5nvoxg5-5goe.googlevideo.com
r1---sn-5go7yn7e.googlevideo.com

But I can’t see any of those in my /etc/pihole/youtube.hosts file

Ideas?


#151

Try

grep "r*\.googlevideo\.com"
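
Why this helps, as far as I can tell: `^` anchors the match to the start of the line, and log lines usually start with a timestamp rather than the hostname. Note also that in a regex `r*` means "zero or more r", so the unanchored pattern effectively matches anything containing ".googlevideo.com". A sketch with a made-up log line (a real pihole.log entry may be laid out differently):

```shell
#!/bin/sh
# Hypothetical log line for illustration.
line='Jan  1 12:00:00 dnsmasq[123]: query[A] r4---sn-aigzrn7z.googlevideo.com from 192.168.1.10'

# Anchored: "^" demands the match start at column 1, where the timestamp sits.
anchored=$(printf '%s\n' "$line" | grep -c '^r*.googlevideo\.com')

# Unanchored: the pattern may match anywhere in the line.
unanchored=$(printf '%s\n' "$line" | grep -c 'r*\.googlevideo\.com')

echo "anchored matches:   $anchored"
echo "unanchored matches: $unanchored"
```

The anchored pattern finds nothing, which would explain the empty test.txt.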

#152

Now… it works perfectly!

Thanks!


#153

I'm leaving here the final script that worked for me after @Chipster's suggestion.

#!/bin/sh
# This script will fetch the Googlevideo ad domains and append them to the Pi-hole block list.
# Run this script daily with a cron job (don't forget to chmod +x)
# More info here: https://discourse.pi-hole.net/t/how-do-i-block-ads-on-youtube/253/136

# File to store the YT ad domains
FILE=/etc/pihole/youtube.hosts

# Wolfram Alpha AppID
APPID=Your-AppID

# Fetch the list of domains, remove the ip's and save them
curl 'https://api.hackertarget.com/hostsearch/?q=googlevideo.com' \
| awk -F, 'NR>1{print $1}' \
| grep -vE "redirector|manifest" > $FILE

# Replace r*.sn*.googlevideo.com URLs to r*---sn-*.googlevideo.com
# and add those to the list too
curl "http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=${APPID}&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More" \
| grep -Po "r\d+---sn-.+.googlevideo.com" >> $FILE

# Scan log file for previously accessed domains
grep "r*\.googlevideo\.com" /var/log/pihole*.log \
| awk '{print $8}' \
| grep -vE "redirector|manifest" \
| sort | uniq >> $FILE

# Add to Pi-hole adlists if it's not there already
if ! grep $FILE < /etc/pihole/adlists.list; then echo "file://$FILE" >> /etc/pihole/adlists.list; fi; 
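
Two optional refinements to the script above, sketched against temporary stand-in files so nothing touches a real Pi-hole install (the paths and domains here are made up): de-duplicating the list after the three sources have been appended, and making the adlists check quiet and exact.

```shell
#!/bin/sh
# Temporary stand-ins for /etc/pihole/youtube.hosts and adlists.list.
workdir=$(mktemp -d)
FILE="$workdir/youtube.hosts"
ADLISTS="$workdir/adlists.list"
: > "$ADLISTS"

# Simulate two sources that append overlapping domains.
printf '%s\n' 'r1---sn-aaaa.googlevideo.com' 'r2---sn-bbbb.googlevideo.com' >> "$FILE"
printf '%s\n' 'r2---sn-bbbb.googlevideo.com' 'r3---sn-cccc.googlevideo.com' >> "$FILE"

# De-duplicate via a temp file, then move it into place.
sort -u "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"

# Register the list only once: -q silences output, -F matches the path
# literally, -x requires the whole line to match.
register_list() {
    grep -qxF "file://$FILE" "$ADLISTS" || echo "file://$FILE" >> "$ADLISTS"
}
register_list
register_list

domain_count=$(wc -l < "$FILE" | tr -d ' ')
adlist_count=$(wc -l < "$ADLISTS" | tr -d ' ')
echo "domains: $domain_count, adlist entries: $adlist_count"
rm -rf "$workdir"
```

Calling `register_list` twice still leaves a single entry, so the script stays safe to run from cron every day.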

#154

I don't know, but I have the feeling that this is going to end up blocking the whole of YouTube.
All the googlevideo.com queries are r*.googlevideo.com, so sometimes I have to reload a video I want to watch so that it changes the domain (or something like that).