[Etherlab-users] Best filesystem for DLS?

Wed Mar 24 15:41:19 CET 2021

Hello Sebastien,

On 24.03.21 at 15:04, Sebastien BLANCHET wrote:
> Dear Wilhelm,
> 
> You are right, there was a configuration error in job.xml the block size
> was only 10. I have increased it to 2000 to get fewer files.

That is not really a configuration error; it is the default value ;-)
Together with your sample frequency it determines the latency: at 1 Hz a
blocksize of 10 gives you a latency of 10 seconds.

A blocksize of 2000 at 1 Hz means that new blocks get written only about
every half an hour (2000 s), so you can only see new data (in dlsgui)
after that time. If that is too much latency, reduce the blocksize,
e.g. to 60.
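
As a rough worked example of the latency arithmetic above: the time needed
to fill one block is the blocksize divided by the sample frequency. This is
only a sketch; the function name is made up for illustration and is not
part of DLS.

```python
def block_latency_seconds(blocksize: int, sample_rate_hz: float) -> float:
    """Seconds needed to fill one block at the given sample frequency."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return blocksize / sample_rate_hz


if __name__ == "__main__":
    # Blocksizes mentioned in this thread, all at 1 Hz.
    for blocksize in (10, 60, 2000):
        latency = block_latency_seconds(blocksize, 1.0)
        print(f"blocksize {blocksize:>4} at 1 Hz: "
              f"{latency:.0f} s ({latency / 60:.1f} min)")
```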

In any case, larger blocksizes result in better compression and less overhead.

Increasing "Reduction" has more impact on file count, because it
influences the creation of "meta levels". Doubling that value (default
30) should be no problem.

DLS is definitely the right tool to store huge amounts of data over a
long period, but to be honest we have never had jobs with 20k signals
until now.

It would be much appreciated if you could later share your experience
with other filesystems.

Regards

Wilhelm Hagemeister

> 
> I will do some tests with other filesystems that can handle more files.
> 
> I am sampling 20K signals at 1Hz without trigger and I keep them for one
> year for technical forensic analysis.
> 
> I do not know if dls is designed to operate at such scale, but it is the
> easiest archiving tool I have ever found and it seems to support the
> workload.
> 
> Best Regards,
> ---
> Sebastien BLANCHET
> 
> 
> 
> On 3/23/21 5:30 PM, Dr.-Ing. Wilhelm Hagemeister wrote:
>> Hi Sebastien,
>>
>> increasing the number of inodes helps.
>>
>> dumpe2fs -h /dev/sdxx shows what your current limits are.
>>
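
For a filesystem that is already mounted, the inode counters can also be
read without root access. The following is a minimal sketch using Python's
standard os.statvfs; the function name and the "/" path are placeholders
for illustration, and some filesystems without a fixed inode table may
report a total of 0.

```python
import os


def inode_usage(path: str) -> dict:
    """Return inode counters for the filesystem that holds `path`."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes; may be 0 without a fixed inode table
    free = st.f_ffree    # free inodes
    used = total - free
    percent = 100.0 * used / total if total else None
    return {"total": total, "free": free, "used": used,
            "percent_used": percent}


if __name__ == "__main__":
    # "/" is only a placeholder; point this at the DLS data directory.
    print(inode_usage("/"))
```

This reports the same kind of information as "df -i" for one mount point.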
>> Regarding dls, please check whether you get restarts of the sampling
>> processes due to sample time constraints. This is usually the case with
>> data sources from realtime processes with high jitter. It results in
>> new chunks being created (and a lot of new files). So have a look in your
>> channel directory (or in dlsgui) to see whether you have lots of chunks
>> with breaks of only a few seconds between them.
>> You also get a log entry in your messages (or journalctl) saying
>> something like: "Time diff of xxx us (expected xxx us, error is percent
>> xxx %)"
>>
>> Also increasing the numbers for blocksize and reduction in the channel
>> setup decreases the number of files.
>>
>> Are you sampling constantly or just small chunks with a trigger for your
>> job? If the latter is the case, try to increase the chunk size by
>> sampling constantly over longer periods.
>>
>> "btrfs" and "xfs" have no fixed inode limit, but we have no experience
>> with them.
>>
>> Regards Wilhelm Hagemeister
>>
>> On 23.03.21 at 17:05, Merkel, Amos wrote:
>>> Hi,
>>>
>>>
>>> I have never tried it, but couldn't you simply define a lower
>>> bytes-per-inode ratio (i.e. more inodes) manually with mkfs.ext4 -i ?
>>>
>>> Arch-Wiki has some explanations on the topic:
>>> https://wiki.archlinux.org/index.php/Ext4#Bytes-per-inode_ratio
>>>
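
To put rough numbers on the bytes-per-inode idea: mkfs.ext4 -i reserves
roughly one inode per given number of bytes, so a smaller value yields more
inodes. The sketch below uses only the figures from the original post (1TB,
50% full, about 78M files); the candidate ratios are illustrative, "1TB" is
assumed to mean 10^12 bytes, and the helper name is made up for this
example.

```python
def inode_count(fs_bytes: int, bytes_per_inode: int) -> int:
    """Approximate inode count: one inode per `bytes_per_inode` bytes."""
    return fs_bytes // bytes_per_inode


FS_BYTES = 10**12            # "1TB", assumed here to be 10^12 bytes
USED_BYTES = FS_BYTES // 2   # "50% full"
FILES = 78_000_000           # "about 78M" files

# Average space consumed per file so far; the ratio passed to -i has to
# stay below this if the inodes are to last until the disk is full.
avg_bytes_per_file = USED_BYTES / FILES

if __name__ == "__main__":
    print(f"average space per file: {avg_bytes_per_file:.0f} bytes")
    for ratio in (16384, 8192, 4096):
        millions = inode_count(FS_BYTES, ratio) / 1e6
        print(f"-i {ratio:>5}: about {millions:.0f}M inodes")
```

With these assumptions the average file takes about 6.4 kB, so a ratio of
4096 would leave some headroom, at the cost of more space reserved for
inode tables.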
>>>
>>> Greetings,
>>>
>>> Amos
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Etherlab-users <etherlab-users-bounces at etherlab.org> on behalf
>>> of blanchet at iram.fr <blanchet at iram.fr>
>>> *Sent:* Tuesday, 23 March 2021 16:35:28
>>> *To:* etherlab-users at etherlab.org
>>> *Subject:* [Etherlab-users] Best filesystem for DLS?
>>>   Hi,
>>>
>>> Short version:
>>> -------------------
>>> What is the best filesystem for DLS data ?
>>>
>>> Long version:
>>> -------------------
>>> I am running DLS to archive a lot of signals (about 20K), and I am
>>> facing an issue with the number of files.
>>>
>>> I have a 1TB EXT4 filesystem to store the data. It is 50% full but there
>>> are no free inodes left, because DLS creates many files and directories.
>>> According to the output of “df -i”, DLS has already created about 78M
>>> files.
>>>
>>> I understand that I have to switch to another filesystem that supports a
>>> bigger number of files.
>>> And I wonder, what is the best filesystem for such a case.
>>>
>>> Regards,
>>> —
>>> Sebastien BLANCHET
>>> -- 
>>> Etherlab-users mailing list
>>> Etherlab-users at etherlab.org
>>> https://lists.etherlab.org/mailman/listinfo/etherlab-users
>>>
>>
>>
>