Mail hang problem

Posted: 30 Mar 2013, 11:27
by redw0001
I've been running a B2 for a few years and switched to B3 in December. Throughout all this my mail has been operating fine, until a couple of weeks ago. Since then it has apparently just stopped receiving mail from my ISP. I checked on the ISP and can see the mail there.

I can sort the problem by making a change to the configuration. I just change something simple such as the setting for a user like 'Leave Copy'. (Mail---Retrieve Mail---edit a user---->tick or untick leave mail on server).
The change forces a restart of the fetchmail daemon as /etc/fetchmailrc has been changed.

Has anybody seen this behaviour or can suggest how I can sort this?

These are the recurring messages I see in the maillog until the restart:
Mar 30 15:14:18 petra fetchmail[1291]: getaddrinfo("mail.plus.net","pop3") error: Name or service not known
Mar 30 15:14:18 petra fetchmail[1291]: POP3 connection to mail.plus.net failed: Connection timed out
Mar 30 15:14:18 petra fetchmail[1291]: Query status=2 (SOCKET)
Mar 30 15:14:38 petra fetchmail[1291]: getaddrinfo("mail.plus.net","pop3") error: Name or service not known
Mar 30 15:14:38 petra fetchmail[1291]: POP3 connection to mail.plus.net failed: Connection timed out
Mar 30 15:14:38 petra fetchmail[1291]: Query status=2 (SOCKET)

Here are the messages from syslog around the time I made the change that forces a restart of the daemon.

Mar 30 15:14:38 petra fetchmail[1291]: getaddrinfo("mail.plus.net","pop3") error: Name or service not known
Mar 30 15:14:38 petra fetchmail[1291]: POP3 connection to mail.plus.net failed: Connection timed out
Mar 30 15:14:38 petra fetchmail[1291]: Query status=2 (SOCKET)
Mar 30 15:15:01 petra /USR/SBIN/CRON[24811]: (root) CMD (/etc/init.d/dovecot status >/dev/null 2>&1 || /etc/init.d/dovecot restart)
Mar 30 15:15:01 petra /USR/SBIN/CRON[24812]: (root) CMD (test -x /usr/bin/php && /usr/bin/php /usr/share/horde3/scripts/alarms.php)
Mar 30 15:15:01 petra /USR/SBIN/CRON[24813]: (root) CMD (test -x /usr/lib/web-admin/notify-dispatcher.pl && /usr/lib/web-admin/notify-dispatcher.pl)
Mar 30 15:15:04 petra bubba-networkmanager: Starting up
Mar 30 15:16:04 petra bubba-networkmanager: Server timed out, terminating
Mar 30 15:16:04 petra bubba-networkmanager: Daemon terminating
Mar 30 15:16:04 petra bubba-networkmanager: Shutting down
Mar 30 15:17:01 petra /USR/SBIN/CRON[24879]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Mar 30 15:19:38 petra fetchmail[1291]: restarting fetchmail (/etc/fetchmailrc changed)
Mar 30 15:19:38 petra fetchmail[1291]: starting fetchmail 6.3.17 daemon
Mar 30 15:19:49 petra fetchmail[1291]: 212 messages (174 seen) for jingle+robin at mail.plus.net (6628482 octets).
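To see how long the lookups have been failing before a restart, the log can be summarised with a small shell helper. This is only a minimal sketch: the function name `dns_failures` is made up for illustration, and the log path in the usage comment is an assumption (the post only says "maillog").

```shell
# dns_failures: read syslog/maillog lines on stdin and summarise the
# fetchmail getaddrinfo errors (count, first and last timestamp).
dns_failures() {
    awk '/fetchmail\[[0-9]+\]: getaddrinfo\(/ && /error:/ {
             n++
             ts = $1 " " $2 " " $3      # month, day, time
             if (n == 1) first = ts
             last = ts
         }
         END {
             if (n) printf "%d failures, first %s, last %s\n", n, first, last
             else   print "no getaddrinfo failures"
         }'
}

# Usage (log path is an assumption, adjust to wherever the maillog lives):
#   dns_failures < /var/log/mail.log
```

Comparing the first failure time with other syslog entries from the same minute may show what else was happening when the lookups started to fail.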

Re: Mail hang problem

Posted: 30 Mar 2013, 12:24
by nobody
Your problem may be DNS: the mail.plus.net server is up and running, but your fetchmail cannot resolve its name.

* ping mail.plus.net from the B3
* replace the mail server name with its IP address
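A ping only tests the lookup at one moment, so it can help to log lookups over time and compare them with the maillog. The sketch below assumes a glibc system, where `getent ahosts` goes through the same getaddrinfo call that fetchmail reports on. The function names `check_name` and `log_lookup` and the 20 second interval are illustrative, not anything the B3 provides.

```shell
# check_name HOST: succeed if the system resolver can resolve HOST.
check_name() {
    getent ahosts "$1" >/dev/null 2>&1
}

# log_lookup HOST: print one timestamped OK/FAIL line for HOST.
log_lookup() {
    if check_name "$1"; then
        echo "$(date '+%b %d %H:%M:%S') lookup OK   $1"
    else
        echo "$(date '+%b %d %H:%M:%S') lookup FAIL $1"
    fi
}

# Usage: leave this running on the B3 and compare with the maillog.
#   while true; do log_lookup mail.plus.net; sleep 20; done
```

If the FAIL lines start at the same time as the fetchmail errors, the resolver on the B3 is the problem. If lookups stay OK while fetchmail keeps failing, the long-running fetchmail daemon itself is the more likely suspect.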

Let me know if this helps.

Re: Mail hang problem

Posted: 31 Mar 2013, 08:45
by redw0001
Thanks nobody,
I've made the change on one of my userids and left the others as they are. Now it is just a waiting game to see if any stop.

Re: Mail hang problem

Posted: 01 Apr 2013, 05:50
by redw0001
Sadly that change does not appear to have worked :(

The ID I changed and the one I left as before are both exhibiting the same problem this morning.

Pings from my desktop (wolverine, Linux Mint Maya) and B3 (petra, supplied firmware level):
robin@wolverine ~ $ ping mail.plus.net
PING mail.plus.net (212.159.9.81) 56(84) bytes of data.
64 bytes from mail.plus.net (212.159.9.81): icmp_req=1 ttl=246 time=73.6 ms
64 bytes from mail.plus.net (212.159.9.81): icmp_req=2 ttl=246 time=41.9 ms
64 bytes from mail.plus.net (212.159.9.81): icmp_req=3 ttl=246 time=54.3 ms
^C
--- mail.plus.net ping statistics ---
4 packets transmitted, 3 received, 25% packet loss, time 3003ms
rtt min/avg/max/mdev = 41.984/56.662/73.628/13.020 ms
robin@wolverine ~ $ ssh petra
Linux petra 2.6.39.4-11 #1 Tue Apr 3 21:45:12 FET 2012 armv5tel

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sun Mar 31 13:41:47 2013 from 192.168.1.105
robin@petra:~$ ping mail.plus.net
PING mail.plus.net (212.159.9.81) 56(84) bytes of data.
64 bytes from mail.plus.net (212.159.9.81): icmp_req=1 ttl=246 time=40.7 ms
64 bytes from mail.plus.net (212.159.9.81): icmp_req=2 ttl=246 time=39.3 ms
64 bytes from mail.plus.net (212.159.9.81): icmp_req=3 ttl=246 time=39.1 ms
64 bytes from mail.plus.net (212.159.9.81): icmp_req=4 ttl=246 time=42.3 ms
64 bytes from mail.plus.net (212.159.9.81): icmp_req=5 ttl=246 time=39.8 ms
^C64 bytes from mail.plus.net (212.159.9.81): icmp_req=6 ttl=246 time=41.3 ms

--- mail.plus.net ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 25446ms
rtt min/avg/max/mdev = 39.155/40.463/42.323/1.125 ms

The thing I did notice was that between each line on the B3 there was quite a big wait (maybe 2-3 seconds at least). I don't know if that suggests the DNS lookup is delayed?

Re: Mail hang problem

Posted: 01 Apr 2013, 08:02
by Gordon
redw0001 wrote: The thing I did notice was that between each line on the B3 there was quite a big wait (maybe 2-3 seconds at least). I don't know if that suggests the DNS lookup is delayed?
Regular ping behaviour is once every second and it wouldn't do a DNS lookup for every individual ping - just the first one to determine where to ping.

What you're seeing is therefore most likely a performance issue and possibly the result of a memory leak. You should check memory usage (top, ps) and maybe restart services that take too much memory. I found on my B3 that Tor was using a huge amount of memory and changed the parameters to limit that. I also frequently restart the Logitech Mediaserver, mostly because it seizes up during new media scans.

Re: Mail hang problem

Posted: 01 Apr 2013, 16:39
by redw0001
A restart of the B3 'possibly' made matters worse. It certainly freed up memory, according to the memory stats in top. However, the problem reappeared only an hour after the reboot, which is the shortest time so far.
Restarted, and boot finished at approx 17:35.
This is all I could find in syslog that suggested a problem. Mail was OK after boot prior to this.
Apr 1 18:30:01 petra /USR/SBIN/CRON[2447]: (root) CMD (/etc/init.d/dovecot status >/dev/null 2>&1 || /etc/init.d/dovecot restart)
Apr 1 18:30:01 petra /USR/SBIN/CRON[2448]: (root) CMD (test -x /usr/bin/php && /usr/bin/php /usr/share/horde3/scripts/alarms.php)
Apr 1 18:30:01 petra /USR/SBIN/CRON[2449]: (root) CMD (test -x /usr/lib/web-admin/notify-dispatcher.pl && /usr/lib/web-admin/notify-dispatcher.pl)
Apr 1 18:34:10 petra fetchmail[1382]: 156 messages (156 seen) for jingle+xxxxx at mail.plus.net (5449367 octets).
Apr 1 18:34:29 petra dovecot: imap-login: Login: user=<xxxxx>, method=PLAIN, rip=91.125.38.69, lip=192.168.1.4, TLS
Apr 1 18:34:33 petra dovecot: imap-login: Login: user=<xxxxx>, method=PLAIN, rip=91.125.38.69, lip=192.168.1.4, TLS
Apr 1 18:34:35 petra fetchmail[1382]: getaddrinfo("mail.plus.net","pop3") error: Name or service not known
Apr 1 18:34:35 petra fetchmail[1382]: POP3 connection to mail.plus.net failed: Connection timed out
Apr 1 18:34:35 petra fetchmail[1382]: Query status=2 (SOCKET)

Re: Mail hang problem

Posted: 02 Apr 2013, 02:25
by Gordon
I don't think restarting the B3 will help you solve the issue. But at least now you know that it is something that gets worse over time, and that does point to a memory leak (or a process that just allocates too much memory). Identify processes that are using 100 MB or (much) more and ask yourself if you need those processes and whether it's possible to limit their memory usage. Likely suspects are the Tor gateway, DLNA server and Torrent client.
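One way to list the processes above that threshold is sketched below. It assumes the procps version of `ps` (for the `rss=` style column options) and uses resident memory (RSS, reported in KB) as the measure. The function names `big_procs` and `show_big_procs` are illustrative.

```shell
# big_procs THRESHOLD_KB: read "RSS_KB PID COMMAND" lines on stdin and
# print the ones whose resident memory is at or above the threshold.
big_procs() {
    awk -v min="$1" '$1 + 0 >= min { printf "%d MB\t%s\t%s\n", $1 / 1024, $2, $3 }'
}

# show_big_procs [THRESHOLD_KB]: largest processes first.
# The default of 102400 KB is the 100 MB mentioned above.
show_big_procs() {
    ps -eo rss=,pid=,comm= | sort -rn | big_procs "${1:-102400}"
}
```

Running `show_big_procs` a few times over an afternoon should show whether any one process keeps growing.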

Re: Mail hang problem

Posted: 12 Apr 2013, 20:03
by Cheeseboy
Gordon wrote:
redw0001 wrote: The thing I did notice was that between each line on the B3 there was quite a big wait (maybe 2-3 seconds at least). I don't know if that suggests the DNS lookup is delayed?
Regular ping behaviour is once every second and it wouldn't do a DNS lookup for every individual ping - just the first one to determine where to ping.
That's not quite true. I've seen that slow ping behaviour too against some hosts. Try ping -n against the same host and I bet you will see the difference.
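A rough way to put numbers on that comparison is to time the same ping with and without -n. This is a minimal sketch: the helper name `elapsed` is made up, and it only has whole-second resolution.

```shell
# elapsed CMD [ARGS...]: run a command quietly and print how many whole
# seconds it took.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Usage on the B3:
#   elapsed ping -c 5 mail.plus.net      # names looked up for each reply
#   elapsed ping -n -c 5 mail.plus.net   # numeric output only
```

If the -n run finishes in about four seconds and the plain run takes much longer, the extra time is most likely going into name lookups rather than the network path.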

You could try changing your configuration to use the IP address instead of mail.plus.net as nobody suggested.
Or you could try commenting out the UDP entry for "pop3" in your /etc/services file (not sure how the getaddrinfo function resolves that, might be worth a try).
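Before editing the file it is worth looking at what is actually listed for the service. The helper below is only a sketch and the name `service_entries` is made up. It prints every name and port/protocol pair for an exact service name, so "pop3s" is not mixed in with "pop3".

```shell
# service_entries NAME [FILE]: print "name port/proto" for each entry of
# NAME in a services file (defaults to /etc/services).
service_entries() {
    awk -v name="$1" '$1 == name { print $1, $2 }' "${2:-/etc/services}"
}

# Usage:
#   service_entries pop3
```

Commented-out lines start with "#" and are therefore skipped, so running it again after the edit shows whether the UDP entry is gone.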

It seems it is polling the server quite often. What is your poll time, and what is the timeout?
You can get information about timeouts and other stuff that fetchmail is doing with your configuration like this:
1. Stop the service:

Code:

$ sudo /etc/init.d/fetchmail stop
2. Run it with the -V switch, as the fetchmail user, and using the configuration file of your B3:

Code:

$ sudo -u fetchmail /usr/bin/fetchmail -f /etc/fetchmailrc -V
3. Make a note of the results, then start the service again:

Code:

$ sudo /etc/init.d/fetchmail start
My timeout is 300 seconds, and my poll interval is 3600. Could it be that it is trying too often, with results still pending? If you look in the man page for fetchmail, it has some things to say about the timeout...
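The poll interval can also be read straight from the configuration file, since fetchmail takes it from a `set daemon <seconds>` line when run as a daemon. The sketch below assumes that line is present in /etc/fetchmailrc on the B3 (the interval could also be passed on the command line by the init script). The function name `daemon_interval` is illustrative.

```shell
# daemon_interval [FILE]: print the poll interval in seconds from the
# first "set daemon N" line of a fetchmailrc (defaults to /etc/fetchmailrc).
daemon_interval() {
    awk '$1 == "set" && $2 == "daemon" { print $3; exit }' "${1:-/etc/fetchmailrc}"
}

# Usage (the file is normally not world-readable):
#   sudo sh -c '. ./interval.sh; daemon_interval'
```

The failed attempts in the maillog excerpt at the top of the thread are 20 seconds apart, so it would be interesting to see whether the configured interval matches that.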

Cheers,

Cheeseboy

Re: Mail hang problem

Posted: 13 Apr 2013, 03:47
by Gordon
Cheeseboy wrote:
Gordon wrote:
redw0001 wrote: The thing I did notice was that between each line on the B3 there was quite a big wait (maybe 2-3 seconds at least). I don't know if that suggests the DNS lookup is delayed?
Regular ping behaviour is once every second and it wouldn't do a DNS lookup for every individual ping - just the first one to determine where to ping.
That's not quite true. I've seen that slow ping behaviour too against some hosts. Try ping -n against the same host and I bet you will see the difference.
Some confusion on my part here, I'm afraid. I use ping so much on customers' Windows machines that I didn't think of anything other than numeric output. It could be a nice check for the TS to add the -n option and verify that the slowdown does indeed come from DNS lookups.

As I hinted before, the B3 does not itself use the DNS cache it provides for LAN-connected machines, but you can easily configure dnsmasq and the B3 network settings so that it does. This would fix the latency introduced by the DNS lookups.
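A quick check of which resolver the B3 itself asks first is sketched below. It assumes the standard /etc/resolv.conf format. How the B3 network settings write that file is not covered here, and the function name `first_nameserver` is made up.

```shell
# first_nameserver [FILE]: print the first "nameserver" address from a
# resolv.conf style file (defaults to /etc/resolv.conf).
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Usage:
#   first_nameserver
```

If this prints 127.0.0.1, local lookups already go through the dnsmasq cache. Any other address means the B3 is querying an upstream server directly for every lookup.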