
Clock off in Roamio by 2 minutes

Discussion in 'TiVo Coffee House - TiVo Discussion' started by PSU_Sudzi, Dec 20, 2017.

  1. m.s

    m.s Active Member

    308
    131
    Mar 8, 2007
    No, they're still screwed up. Two of them have an unknown refid, which probably means they're not sync'd to anything, just running on their local clock. That they were all reporting stratum 6 when you checked just shows they're all misconfigured.
    time3.apple.com is stratum 1, so sjr3 should be reporting itself as stratum 2. But right now, 1 and 2 are reporting stratum 16.
    Code:
    locke:~# ntpdate -q -p1 time3.apple.com
    server 17.254.0.31, stratum 1, offset -0.000482, delay 0.09717
    13 Jan 10:25:42 ntpdate[21349]: adjust time server 17.254.0.31 offset -0.000482 sec
    locke:~# ntpdate -q -p1 sjr1.tivo.com
    server 204.176.49.10, stratum 16, offset 0.828403, delay 0.09518
    13 Jan 10:25:45 ntpdate[21350]: no server suitable for synchronization found
    locke:~# ntpdate -q -p1 sjr2.tivo.com
    server 204.176.49.11, stratum 16, offset 0.333987, delay 0.09563
    13 Jan 10:25:48 ntpdate[21353]: no server suitable for synchronization found
    locke:~# ntpdate -q -p1 sjr3.tivo.com
    server 204.176.49.12, stratum 6, offset 0.971833, delay 0.09564
    13 Jan 10:25:52 ntpdate[21359]: step time server 204.176.49.12 offset 0.971833 sec
    
    I'm pretty sure they're running Windows time service, and not the canonical ntpd. Ugh.
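    For anyone who wants to track these readings over time, here is a minimal Python sketch that pulls the server lines out of `ntpdate -q` output like the above and labels each stratum. The function names and the regular expression are my own assumptions, written against the output format shown in this thread; the labels follow the usual NTP convention (1 is a primary server, 2 through 15 are secondary, 16 means unsynchronized).

```python
import re

# Matches lines such as:
#   server 204.176.49.10, stratum 16, offset 0.828403, delay 0.09518
SERVER_LINE = re.compile(
    r"server\s+(?P<addr>\S+),\s+stratum\s+(?P<stratum>\d+),"
    r"\s+offset\s+(?P<offset>[-+]?\d+\.\d+),\s+delay\s+(?P<delay>\d+\.\d+)"
)


def describe_stratum(stratum):
    """Label a stratum value using the usual NTP convention."""
    if stratum == 1:
        return "primary (attached to a reference clock)"
    if 2 <= stratum <= 15:
        return "secondary (synced over NTP)"
    # 16 (and 0 on the wire) means the server is not synchronized.
    return "unsynchronized"


def parse_ntpdate(output):
    """Return one dict per 'server ...' line found in ntpdate -q output."""
    rows = []
    for line in output.splitlines():
        match = SERVER_LINE.search(line)
        if not match:
            continue  # skip the 'adjust time server' / 'no server suitable' lines
        stratum = int(match.group("stratum"))
        rows.append({
            "addr": match.group("addr"),
            "stratum": stratum,
            "offset": float(match.group("offset")),
            "delay": float(match.group("delay")),
            "status": describe_stratum(stratum),
        })
    return rows
```

    Feeding it the sjr1 line above should give a row with status "unsynchronized", while sjr3 at stratum 6 comes back as "secondary", even though a stratum that deep is suspicious for a server that should sit one hop from a stratum 1 source.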
     
  2. dlfl

    dlfl Cranky old novice

    7,810
    273
    Jul 6, 2006
    Dayton OH
    Just got this:

    Using ntpdate -q on my raspberry pi. Doesn't look good (i.e., stratum 16) for sjr3. I'm a noobie to this ntp stuff. I believe the offsets are relative to the RPI's clock, right? It is being set by a daemon that runs automatically and is always within a second of WWV ("atomic clock") time even when the RPI runs for days. Stratum 16 means the server is unsynchronized which can't be good, even though the offset is low.

    EDIT: Now all three are stratum 6, with offsets ranging from 1 to 1.5 secs.
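    On the question of what the offsets are relative to: yes, they are relative to the clock of the machine running the query. A rough sketch of the standard NTP arithmetic is below. The function name and the example timestamps are made up for illustration; only the two formulas come from the protocol.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client clock when the request left
    t2: server clock when the request arrived
    t3: server clock when the reply left
    t4: client clock when the reply arrived
    """
    # Positive offset means the server is ahead of the local clock.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    # Round-trip network delay, excluding the server's processing time.
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: server clock runs 1 s ahead of the client,
# with 0.05 s of network latency in each direction.
offset, delay = ntp_offset_and_delay(t1=100.0, t2=101.05, t3=101.05, t4=100.10)
print(round(offset, 3), round(delay, 3))
```

    So an offset of 1 to 1.5 secs from a TiVo server, alongside a near-zero offset from a stratum 1 server queried from the same machine, points at the TiVo server rather than the local clock.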
     
  3. HerronScott

    HerronScott Well-Known Member

    5,669
    582
    Jan 1, 2002
    Staunton, VA
    Windows time service works fine, assuming your hardware and network infrastructure are configured correctly; at least we haven't seen any issues over the last 17 years here (1000 servers and 10,000 workstations). And yeah, that's probably the reason for the refid, as you mentioned, but stratum 6 or 16 is just wrong.

    Scott
     
  4. sharkster

    sharkster Well-Known Member TCF Club

    7,897
    1,329
    Jul 3, 2004
    NV
    I wish some of you guys worked at TiVo! You all seem to know way more than the ones forking everything up there.
     
  5. JoeKustra

    JoeKustra in the other Alabama TCF Club

    12,854
    1,744
    Dec 7, 2012
    Ashland, PA...
    I would be happy if they would just document the logs. Plus, we work for free.
     
  6. m.s

    m.s Active Member

    308
    131
    Mar 8, 2007
    For the definition of "fine" which only includes what Windows needs to function. It's poorly documented (the disciplining algorithms are completely undocumented), doesn't support reference clocks per se, and has questionable accuracy. By Microsoft's own admission:
    ntp or Chrony do a much better job.
     
    tim1724 likes this.
  7. dlfl

    dlfl Cranky old novice

    7,810
    273
    Jul 6, 2006
    Dayton OH
    Watching over the last hour or so, one of the servers will pop into stratum 16 for a little while then back to stratum 6. But the offsets all have been less than 2 secs.
     
  8. burdellgp

    burdellgp Member

    84
    11
    Mar 27, 2008
    Huntsville, AL
    In my case, I already have a stratum 1 server in the house (GPS receiver running in time-sync mode connected to an always-on server). I DNAT all UDP port 123 connections to the three TiVo IPs to my home server, then SNAT the source to the router (so the server replies to the router to reverse the NAT). This is on a router running LEDE (fork of OpenWRT).
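    A rough sketch of what such a redirect could look like, written as a small Python helper that builds the iptables commands rather than as the actual router config (which isn't shown in this thread). The three TiVo addresses come from the ntpdate output earlier in the thread; the home server and router LAN addresses are placeholders, and on LEDE the same thing could equally be done through the firewall config files.

```python
# TiVo NTP server addresses as seen in the ntpdate output in this thread.
TIVO_NTP_IPS = ["204.176.49.10", "204.176.49.11", "204.176.49.12"]


def ntp_redirect_rules(ntp_server_ip, router_lan_ip, tivo_ips=TIVO_NTP_IPS):
    """Build iptables argument lists that send TiVo NTP queries to a local server.

    ntp_server_ip and router_lan_ip are assumptions supplied by the caller.
    """
    rules = []
    for ip in tivo_ips:
        # DNAT: rewrite the destination of NTP queries aimed at a TiVo server.
        rules.append([
            "iptables", "-t", "nat", "-A", "PREROUTING",
            "-p", "udp", "-d", ip, "--dport", "123",
            "-j", "DNAT", "--to-destination", ntp_server_ip,
        ])
    # SNAT: make the query appear to come from the router, so the local
    # server replies to the router and the NAT can be reversed.
    rules.append([
        "iptables", "-t", "nat", "-A", "POSTROUTING",
        "-p", "udp", "-d", ntp_server_ip, "--dport", "123",
        "-j", "SNAT", "--to-source", router_lan_ip,
    ])
    return rules


# Placeholder addresses; print the commands instead of running them.
for rule in ntp_redirect_rules("192.168.1.10", "192.168.1.1"):
    print(" ".join(rule))
```

    Printing the commands first makes it easy to review them before putting anything on the router.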

    There's really no reason for TiVo to even be running these bad NTP servers; they could work with the public NTP pool project and use hostnames like 0.tivo.pool.ntp.org, 1.tivo.pool.ntp.org, 2.tivo.pool.ntp.org (and TiVos setting the clock would get a variety of sources). Ideally then TiVo would set up some servers to join the pool and help with the load, but not if they can't run one any better than this.
     
    tim1724, slowbiscuit and Razzer like this.
  9. HerronScott

    HerronScott Well-Known Member

    5,669
    582
    Jan 1, 2002
    Staunton, VA
    Yes, I'm fully aware of Microsoft's statement and the details above (actually queried Microsoft last year on the algorithms used by the Windows Time Service to answer some internal queries on the time sync). The Kerberos requirement for time synchronization for authentication and other Windows requirements are pretty broad. My point is that we have never seen issues like the TiVo NTP servers are displaying.

    Currently, with 41 DCs in data centers and branch offices around the world, our greatest offset from our root is -0.0233637s (and our root's offset from the NIST NTP server is +0.0077669s). I'm sure ntp or Chrony could do better, but we're good with this level of accuracy and I'd be happy if TiVo's NTP servers were behaving this well. :)

    Scott
     
    Razzer likes this.
  10. kdmorse

    kdmorse Well-Known Member

    5,872
    451
    Jan 29, 2001
    Germantown, MD
    I suspect someone kicked the environment, got them all back in sync, got 2 of the 3 reporting stratum 2, and decided to watch to see if they stay that way. And they did, for a while.

    They all drifted back to stratum 6 (or 16/0) eventually, the major drifting started again, and sjr1 marched up to +8s before resetting. Now they're back to stratum 6/2/2, and mostly keeping sub 1s time.

    sjr2 has occasionally been reporting stratum 3 since. This tells me a human changed something. I've not seen a stratum 3 response in the 3 months I've been watching.

    But even so, it's clear they're doing minimal poking, and hoping that quick fixes make it all better. When in reality, if any of your servers are reporting stratum 4 or higher*, ever, you're not done fixing yet. They need to set up a cron script that polls each server once a minute and, if any stratum is 6 or higher, emails the admin...

    (while writing this, they've lost their sync again, and are all back to stratum 6)

    * excluding bizarre edge cases.
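    Something along the lines of the cron check described above could be sketched like this in Python. It assumes `ntpdate` is installed and prints output in the format seen in this thread; the function names are hypothetical, and the "email the admin" step is left as a plain print so that cron's own mail-on-output behavior (or whatever alerting you prefer) can pick it up.

```python
import re
import subprocess

HOSTS = ["sjr1.tivo.com", "sjr2.tivo.com", "sjr3.tivo.com"]

# Matches: server 204.176.49.10, stratum 6, offset 51.154737, delay 0.04187
SERVER_LINE = re.compile(
    r"server\s+(\S+),\s+stratum\s+(\d+),\s+offset\s+([-+]?\d+\.\d+)"
)


def stratum_alerts(output, max_ok_stratum=5):
    """Return one alert string per server whose stratum exceeds the limit."""
    alerts = []
    for addr, stratum, offset in SERVER_LINE.findall(output):
        if int(stratum) > max_ok_stratum:
            alerts.append(
                "%s reports stratum %s (offset %s s)" % (addr, stratum, offset)
            )
    return alerts


def poll(hosts=HOSTS):
    """Run ntpdate in query-only mode and return its standard output."""
    # ntpdate exits non-zero when no server is usable, so don't raise on that.
    result = subprocess.run(
        ["ntpdate", "-q"] + list(hosts), capture_output=True, text=True
    )
    return result.stdout


def main():
    # Intended to be run from cron once a minute; prints only on trouble.
    for alert in stratum_alerts(poll()):
        print(alert)
```

    The default limit of 5 flags stratum 6 and up, matching the threshold suggested above; passing `max_ok_stratum=3` would apply the stricter "4 or higher" rule instead. Call `main()` from a script that cron runs once a minute.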
     
    tim1724 likes this.
  11. dlfl

    dlfl Cranky old novice

    7,810
    273
    Jul 6, 2006
    Dayton OH
    Here is what I just got:

    server 17.254.0.31, stratum 1, offset 0.00177 (time3.apple.com)

    server 204.176.49.10, stratum 2, offset 0.21881 (sjr1.tivo.com)

    server 204.176.49.11, stratum 2, offset 0.19219 (sjr2.tivo.com)

    server 204.176.49.12, stratum 16, offset 0.06090
    ERROR: 13 Jan 21:43:54 ntpdate[5473]: no server suitable for synchronization found (sjr3.tivo.com)

    I threw in the apple server, which indicates my local RPI clock is pretty close (This is usual).
    Ironic that the tivo server at stratum 16 has the smallest offset. (Even a stopped clock is accurate once or twice a day :))
     
  12. m.s

    m.s Active Member

    308
    131
    Mar 8, 2007
    23 ms is very large in the ntp world. Do you have any metrics for the frequency domain?
     
  13. sfhub

    sfhub Well-Known Member

    2,863
    472
    Jan 6, 2007
    It is silly this has gone on for so long. It is one thing if things didn't get reported to NOC because support didn't escalate, but this is after TiVo_Ted sent a note to them. Clearly TiVo NOC doesn't know how to configure NTP servers.

    They've got some kludge in to get the offsets within 7 seconds again.

    Since they didn't fix the core problem of drift, it will probably show up again whenever their kludge fails, but for now 99.9% of the people won't care if TiVo is within 7 seconds.
     
  14. ggieseke

    ggieseke Well-Known Member

    4,635
    252
    May 30, 2008
    sjr1 is way off again.
     
  15. JoeKustra

    JoeKustra in the other Alabama TCF Club

    12,854
    1,744
    Dec 7, 2012
    Ashland, PA...
    Almost 40 seconds.
     
  16. tim1724

    tim1724 Active Member TCF Club

    422
    67
    Jul 3, 2007
    Temple City, CA
    Over 51 seconds now.

    sequoia:~ tim$ ntpdate -q sjr{1,2,3}.tivo.com time.apple.com
    server 204.176.49.10, stratum 6, offset 51.154737, delay 0.04187
    server 204.176.49.11, stratum 6, offset 2.413703, delay 0.04134
    server 204.176.49.12, stratum 16, offset 0.195589, delay 0.04131
    server 17.253.26.125, stratum 1, offset 0.000098, delay 0.02762
    server 17.253.26.253, stratum 1, offset 0.000109, delay 0.02760
    server 17.253.2.125, stratum 1, offset -0.000152, delay 0.05566
    server 17.253.4.125, stratum 1, offset 0.000068, delay 0.03609
    server 17.253.4.253, stratum 1, offset -0.000041, delay 0.03696
    16 Jan 15:56:17 ntpdate[90631]: adjust time server 17.253.26.253 offset 0.000109 sec


    (The 204.* entries are TiVo, 17.* is Apple.)
     
  17. sharkster

    sharkster Well-Known Member TCF Club

    7,897
    1,329
    Jul 3, 2004
    NV
    One of mine was almost a minute fast again this morning. Glad I caught it early. A re-connection seemed to fix it - for now. I'm getting to where I don't have the mental energy for some of this babysitting I have to do, now, with the Tivos.
     
    unclehonkey likes this.
  18. tim1724

    tim1724 Active Member TCF Club

    422
    67
    Jul 3, 2007
    Temple City, CA
    sjr1 is over a minute off now (69 seconds) … they clearly haven't fixed anything.
     
  19. dlfl

    dlfl Cranky old novice

    7,810
    273
    Jul 6, 2006
    Dayton OH
    sjr1 at 88 sec now.
     
  20. hapster85

    hapster85 Active Member

    231
    57
    Sep 7, 2016
    I remember watching a recording, while I was off work over Christmas, where I thought the program had been delayed for whatever reason. Now I think it may have been this issue. Can't remember which recording now, though.
     
