NTFS USB drive with filenames with Special Characters in Linux

#1

Has anybody else been plagued by an NTFS USB drive with Special Characters?
I used to be able to download shows with names that had special characters, but that no longer works. In fact, I can’t even rename some old files that contain them.

I suspect this is due to the new Linux 5 kernel? Special Characters work fine
on the /home partition, which is EXT4.

#2

Your question is extremely vague in some areas.

Where are you loading shows from, and with what? Or are you copying files from device to device on the same system? As a work-around, rename them while they are on the NTFS drive before copying/moving to EXT4.

Which Linux distribution is using the version 5 kernel? If it’s a deal breaker, maybe re-install a 4.x linux-image-amd64 (or whatever the equivalent is on your system / package repository).

#3

yeah, sorry - ticked that things broke out of the blue.

I am using tablo2go to download from the TabloTV over WiFi. When I try to download a show whose name contains a “:” (colon) it pukes -

192.168.0.101 - 779027 - 2019-03-21 00:00Z - EP024345980051 - Downloading - ./TV/Riverdale/Riverdale - S03E16 - Chapter Fifty-One: BIG FUN
Traceback (most recent call last):
File “tablo2go-3.35-with-fixes_KK.py”, line 683, in
elif (get_video(QUEUE[TABLO_IP][airing_num][‘m3u8’], QUEUE[TABLO_IP][airing_num][‘build’])):
File “tablo2go-3.35-with-fixes_KK.py”, line 378, in get_video
with open(filename+’.ts’, ‘wb’) as fileObject:
OSError: [Errno 22] Invalid argument: ‘./TV/Riverdale/Riverdale - S03E16 - Chapter Fifty-One: BIG FUN.ts’
python3 tablo2go-3.35-with-fixes_KK.py -ip 192.168.0.101
[kkoceski@E4300XFCE TabloTV_1.5TB]$

I am using Arch Linux, and my present kernel is 5.0.3-arch1-1-ARCH

Thanks for listening - Kurt

#4

OK, I’m not specifically familiar with tablo2go, but from what little I know, the issue looks to be with the Python script (aka tablo2go) - not directly related to NTFS or kernel 5.

Just speculation: in the line OSError: [Errno 22] Invalid argument: ‘./TV/Riverdale/Riverdale - S03E16 - Chapter Fifty-One: BIG FUN.ts’, the invalid argument may be related to that colon you mentioned… special characters in filenames are bothersome.

There were similar issues with capto; the developer ended up modifying the code to remove all special characters from filenames (derived from episode titles) to eliminate errors.

Now I’m not a developer, this was just speculation.

#5

NTFS is a Microsoft file system. Would it not allow certain characters in path or filenames?

https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file#naming-conventions

Use any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:

The following reserved characters:
    < (less than)
    > (greater than)

    : (colon)
    " (double quote)
    / (forward slash)
    \ (backslash)
    | (vertical bar or pipe)
    ? (question mark)
    * (asterisk)
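
As a rough illustration of that list, here is a minimal Python sketch that swaps each reserved character for an underscore before a name is used on an NTFS volume. The function name `sanitize_for_windows` and the underscore replacement are assumptions made up for this example; they are not taken from tablo2go or from the Microsoft page.

```python
# Characters the Microsoft naming page lists as reserved in file names.
RESERVED = '<>:"/\\|?*'


def sanitize_for_windows(name, replacement="_"):
    """Return name with every reserved character replaced (hypothetical helper)."""
    return "".join(replacement if ch in RESERVED else ch for ch in name)


# Example with the episode title from the traceback earlier in the thread.
print(sanitize_for_windows("Riverdale - S03E16 - Chapter Fifty-One: BIG FUN"))
```

Note that this is meant for a single file name, not a whole path, since it would also replace the slashes that separate directories.
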
#6

Well, tablo2go has “BAD_CHARS” in it, which seems to indicate that it tries to remove Special Characters, but I still have names of shows that contain them.

#7

Are you talking about special characters that may not be allowed in python or special characters that the file system doesn’t allow?

#8

Even though Windows doesn’t support Special Characters, Linux used to be able to use them on Microsoft formatted NTFS drives. Just recently, my Linux system can no longer access Special Characters on NTFS drives.

#9

NTFS is a proprietary file system. Maybe the linux distro you were using wasn’t conforming to the Microsoft standard and someone jerked their chain.

If Microsoft doesn’t allow these characters in file names and another OS does what would happen if the file system was mounted in Windows?

#10

Oh Yeah, I do know from the past, that if I mount this drive on a Windows system, I can “see” the files with Special Characters, but can do NOTHING with them.

#11

OK then, you might have answered your own question. Your Traceback does state OSError: but again, I’m just digging at things. I use Debian buster so I don’t know the intricacies of Arch, but I have discovered they have the most current and detailed documentation. I understand Arch to be a “rolling” release, so if it’s been upgraded and now things are different, you may want to check in there.

#12

NTFS has changed versions over the years and various ntfs-3g drivers might have finally caught up… or worked around things.

#13

It has routines to remove characters and yet you have them… sounds flubbed-up. Again, I don’t use this app. (I’ve had success with capto).
The snippet in your debug traceback seems to be building file names for the HLS segment .ts files.

#14

So, I have to wonder if my drive is getting mounted differently now.

I found this -->

It says -->
If you’re using ntfs-3g to mount your NTFS filesystem, the windows_names option will prevent files with problematic names from being created:

ntfs-3g -o windows_names ...
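
One way to see how a given mount currently behaves, whatever options it was mounted with, is simply to try creating a file with a colon in its name there. The sketch below does that in Python; `accepts_colon` and the `probe_` prefix are hypothetical names chosen for this example, and it assumes a rejected name shows up as the same Errno 22 (EINVAL) seen in the traceback above.

```python
import errno
import os
import tempfile


def accepts_colon(directory):
    """Try to create a file whose name contains ':' inside directory.

    Returns True if the filesystem accepted the name, False if it was
    rejected with EINVAL (the Errno 22 from the traceback).
    """
    # Make a uniquely named plain file first so the probe cannot clobber
    # anything real; "probe_" is just an illustrative prefix.
    fd, base = tempfile.mkstemp(dir=directory, prefix="probe_")
    os.close(fd)
    probe = base + ":test"
    try:
        with open(probe, "wb"):
            pass
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            return False
        raise  # some other problem (permissions, read-only mount, ...)
    else:
        os.remove(probe)
        return True
    finally:
        os.remove(base)
```

Calling it once with the mount point of the NTFS drive and once with a directory under /home should show whether the two filesystems really treat the colon differently.
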
#15

I’m not sure I found that option on the Linux man page. But there are man pages on how to properly unmount and mount a file system with user specified options.

http://man7.org/linux/man-pages/man8/mount.8.html

#16

I would back up a bit… if tablo2go is supposed to sanitize filenames, why are you even getting them in the first place!? I understand you were able to deal with it previously, but if there are routines to:

# Clean a string of all bad characters
# Default ASCII ranges 48-57, 65-90, 97-122, - _ . are allowed,
#  otherwise BAD_CHARS can be defined directly

and they still show up in filenames… why?
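
To make that comment concrete, here is a minimal sketch of what a cleaner matching its description could look like. This is a guess written from the comment alone, not the actual tablo2go code; the names `clean_whitelist`, `ALLOWED` and `bad_chars` are made up for illustration.

```python
import string

# Allowed set taken from the quoted comment: digits (48-57), upper case
# (65-90), lower case (97-122), plus "-", "_" and ".".
ALLOWED = set(string.digits + string.ascii_letters + "-_.")


def clean_whitelist(text, bad_chars=None):
    """Hypothetical re-implementation; not the actual tablo2go clean()."""
    if bad_chars is not None:
        # Blacklist mode: drop only the characters named explicitly.
        return "".join(ch for ch in text if ch not in bad_chars)
    # Whitelist mode: keep only the allowed ASCII characters.
    return "".join(ch for ch in text if ch in ALLOWED)


print(clean_whitelist("Chapter Fifty-One: BIG FUN"))
print(clean_whitelist("Chapter Fifty-One: BIG FUN", bad_chars=":"))
```

In this sketch the default mode would strip the spaces as well as the colon, while the filename in the traceback still has both. That may hint that the name in question was not passed through the default cleaning, but that is only a guess from the outside.
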

#18

Probably because scanning that set of ASCII characters died out 20 years ago and was replaced at the minimum with the latin-9/iso 8859 character set.

#19

I certainly don’t mean to discount the work and effort put into the application! Even though it may or may not be the best route, something’s getting through. I’m guessing it started out as “it works for me” and was elaborated on so it could be shared with the community.

Although I guess the OP’s question is why the system worked before and now doesn’t. I think that’s better asked in Arch Linux support.

#20

What you cut and pasted is just a comment. I doubt that is how the scan code actually works. Otherwise French Canadian names would be chopped. That would probably make some people unhappy.

Each operating system has a minimum set of file system namespace rules. It just so happens that since NTFS is a Windows based file system it transcends multiple operating systems.

#21

That was from tablo2go-3.35-with-fixes.py Line 258 introducing
def clean(input, OVERRIDE={}): then a routine defining BAD_CHARS, elaborate conditional loop checks and returns results.

I’m not proficient with the Python language/syntax, I can just follow some of the logic and data flow to a point… So I may be waaay off.