Linux Notes

System/Hardware Info

Linux Distributions

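Grouped by package manager, the mainstream distributions are

  • apt: Debian, Ubuntu, Linux Mint
  • yum: CentOS, Fedora
  • dnf: Dandified Yum, the next-generation version of yum (Yellowdog Updater, Modified), the package manager for RPM-based distributions; it is the default on recent Fedora and on CentOS/Rocky 8 and later
  • YaST: openSUSE
  • Pacman: Manjaro, Arch Linux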

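Alternatively, this shell exercise groups them as follows (the grouping is the exercise's own; Gentoo is in fact an independent, Portage-based distribution rather than a Red Hat derivative):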

  • Redhat Series: Fedora, Gentoo, Redhat
  • Suse Series: Suse, OpenSuse
  • Debian Series: Ubuntu, Mint, Debian

So far I have only used Ubuntu (personal laptop) and CentOS / Rocky (servers), so these notes mainly target those two families.

CPU vs Thread vs Core vs Socket

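  • CPU: Central Processing Unit. The term is broad and its meaning depends on context; lscpu, for example, uses it for the number of threads (logical CPUs). CPUs = Threads per core * cores per socket * sockets
  • CPU Socket: a CPU is mounted on the motherboard through a slot, and that slot is the socket.
  • Core: one CPU can contain multiple cores. The cores are independent of each other and can execute in parallel; each core has its own physical hardware such as registers and L1, L2 caches.
  • Thread: in lscpu output this is a hardware thread (logical CPU) exposed by SMT / Hyper-Threading, not a separate physical unit. Threads on the same core share that core's execution resources, so a second thread mainly uses cycles the first would leave idle; it behaves more like concurrency on one core than like the full parallelism of a second core.
  • vCPU: common in virtualized environments; a vCPU usually corresponds to one thread.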

G40 $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
CPU(s):                          4
Thread(s) per core:              2
Core(s) per socket:              2
Socket(s):                       1
This indicates 2 cores and 4 threads.

T460P $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
CPU(s):              4
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
This indicates 4 cores and 4 threads.

Reference: 三分钟速览cpu,socket,core,thread等术语之间的关系 (a three-minute overview of how the terms CPU, socket, core, and thread relate)
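The relation CPUs = Threads per core * cores per socket * sockets can be checked directly against lscpu output. The following is a minimal sketch; the function name cpu_count_from_lscpu is my own, and it assumes the English field labels shown above (run lscpu with LC_ALL=C if your system prints translated labels).

```shell
# Multiply the three lscpu fields read from stdin to get the logical CPU count.
cpu_count_from_lscpu() {
    awk -F: '
        /^Thread\(s\) per core/ { threads = $2 + 0 }
        /^Core\(s\) per socket/ { cores   = $2 + 0 }
        /^Socket\(s\)/          { sockets = $2 + 0 }
        END { print threads * cores * sockets }
    '
}

# Usage on a real machine: LC_ALL=C lscpu | cpu_count_from_lscpu
```

For the G40 output above this gives 2 * 2 * 1 = 4, matching the CPU(s) line.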

Locale

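A locale, also called a “localization policy set” or “local environment”, is a set of software settings that express the user's region. Different systems, platforms, and software handle locales differently and cover different ranges of settings, but a locale generally includes at least a language and a region. Locale contents include: data formats, currency amount formats, decimal separator, thousands separator, units of measurement, currency symbol, date format, calendar type, text collation, name format, address format, and so on. source: Wikipedia (维基百科)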

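Locale settings take effect in the following order of precedence

  1. LANGUAGE: a prioritized list of the user's preferred languages (GNU gettext uses it for message translation only). Ubuntu sets this environment variable, but it did not seem to be set on a CentOS 7.4 server.
  2. LC_ALL: overrides every other locale setting, which is why it is empty by default. It is an environment variable; LC_ALL is also the name of the category macro passed to the C library function setlocale.
  3. LC_xxx: sets the locale for an individual category and overrides the value of LANG.
  4. LANG: the default locale, used when nothing above is set.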

If the locale is misconfigured, you may see

$ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=C.UTF-8
LC_CTYPE=C.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="C.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="C.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=

In that case, running

export LC_ALL=en_US.UTF-8

fixes the problem. The line can go into .bashrc and does not need sudo, whereas the methods mentioned in How do I fix my locale issue? require sudo.
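Before exporting LC_ALL it can help to confirm that the locale is actually installed, since the "No such file or directory" messages above appear when a requested locale is missing. This is a rough sketch: locale_available is a hypothetical helper name, and the case and hyphen folding assumes that locale -a may print en_US.utf8 for what is usually written en_US.UTF-8.

```shell
# Succeeds if the locale named in $1 appears in the `locale -a` style list on stdin.
# Case and hyphens are ignored so that en_US.UTF-8 matches en_US.utf8.
locale_available() {
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
    tr '[:upper:]' '[:lower:]' | sed 's/-//g' | grep -qxF "$want"
}

# Usage, e.g. in .bashrc:
#   if locale -a | locale_available en_US.UTF-8; then export LC_ALL=en_US.UTF-8; fi
```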

shared objects .so (dynamic library)

LD_PRELOAD

Note

Used in 🔗

If LD_PRELOAD is set on the same line, directly before a command, it applies only to that command. For example,

(R4.1.0) /media/weiya/PSSD/Programs/anaconda3/envs/R4.1.0/lib/R/lib$ ldd libR.so 
        libstdc++.so.6 => /media/weiya/PSSD/Programs/anaconda3/envs/R4.1.0/lib/R/lib/./../.././libstdc++.so.6 (0x00007f80f4d1e000)

(R4.1.0) /media/weiya/PSSD/Programs/anaconda3/envs/R4.1.0/lib/R/lib$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 ldd libR.so 
        /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fc70eb4c000)

Warning

It should be a particular file instead of a folder; otherwise it throws

$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/ ldd libR.so 
ERROR: ld.so: object '/usr/lib/x86_64-linux-gnu/' from LD_PRELOAD cannot be preloaded (cannot read file data): ignored.

In the words of 🔗,

LD_PRELOAD is an environment variable, it affects only the current process

Alternatively, we can use export and unset,

~/Programs/anaconda3/envs/R4.1.0/lib/R/lib$ export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6
~/Programs/anaconda3/envs/R4.1.0/lib/R/lib$ ldd libR.so 
        /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f601f456000)
~/Programs/anaconda3/envs/R4.1.0/lib/R/lib$ conda activate R4.1.0
(R4.1.0) ~/Programs/anaconda3/envs/R4.1.0/lib/R/lib$ echo $LD_PRELOAD 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
(R4.1.0) ~/Programs/anaconda3/envs/R4.1.0/lib/R/lib$ ldd libR.so 
        /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb231b15000)

BTW, in the above, conda activate does not change LD_PRELOAD.
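The difference between the one-line form and export/unset is ordinary environment-variable scoping, so it can be illustrated safely with a stand-in variable. DEMO_VAR below is hypothetical and has nothing to do with the dynamic loader; the point is only which child processes see the value.

```shell
unset DEMO_VAR

# Prefix form: only this one child process sees the variable.
inline=$(DEMO_VAR=1 sh -c 'echo "${DEMO_VAR:-unset}"')
after=$(sh -c 'echo "${DEMO_VAR:-unset}"')

# export: every later child sees it, until unset.
export DEMO_VAR=1
exported=$(sh -c 'echo "${DEMO_VAR:-unset}"')
unset DEMO_VAR
cleared=$(sh -c 'echo "${DEMO_VAR:-unset}"')

echo "$inline $after $exported $cleared"   # prints: 1 unset 1 unset
```

The same pattern applies to LD_PRELOAD: after export, remember to unset it once it is no longer needed.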

Warning

export LD_PRELOAD did not work in GitHub Actions, such as 🔗. The proper way is to set it under env. 🔗

Tip

The documentation can be found in man ld.so, not man ld.

search order

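An executable looks for shared objects (.so files) in the following locations, in order

  • rpath (DT_RPATH, consulted only when the binary has no runpath)
  • LD_LIBRARY_PATH
  • runpath (DT_RUNPATH)
  • /etc/ld.so.conf (through the cache file /etc/ld.so.cache that ldconfig builds)
  • /lib
  • /usr/lib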

So there are several ways to fix the NotFound error,

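# method 1: link the library into a default search directory
sudo ln -s /where/your/lib/*.so /usr/lib
sudo ldconfig
# method 2: extend the search path for the current shell (no ldconfig needed)
export LD_LIBRARY_PATH=/where/your/lib:$LD_LIBRARY_PATH
# method 3: register the directory system-wide
# (sudo does not apply to a >> redirection, so append with tee instead)
echo "/where/your/lib" | sudo tee -a /etc/ld.so.conf
sudo ldconfig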
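To see which libraries are actually unresolved before and after trying one of these methods, the ldd output can be filtered. A small sketch, where missing_libs is a helper name of my own and the input is assumed to follow the usual "name => not found" ldd format:

```shell
# Print the names of shared objects that ldd reports as "not found" (ldd output on stdin).
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage: ldd libR.so | missing_libs
```

An empty result means every dependency was resolved somewhere along the search order.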

update-alternatives

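update-alternatives can switch between versions, but the versions must first be registered with --install. Here install does not mean downloading and building from source; it means registering the different gcc versions already present on the system as alternatives. For example, gcc --version on my machine currently reports 7.5.0, and commands such as gcc-5 and gcc-4.8 also exist, but they are not registered as alternatives, as running the following directly shows: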

$ sudo update-alternatives --config gcc
update-alternatives: error: no alternatives for gcc

So follow How to switch GCC version using update-alternatives and first run

sudo update-alternatives --install ....

and then run --config.
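For reference, --install takes four arguments: link, name, path, and priority. The sketch below only prints the commands for review instead of running them; the /usr/bin/gcc-<version> paths and the priorities are assumptions for illustration, and gcc_alternative_cmd is a hypothetical helper name.

```shell
# Print (do not run) the command that registers gcc-<version> as an alternative for gcc.
# Arguments: version, priority. Paths are assumed; check them with `ls /usr/bin/gcc-*`.
gcc_alternative_cmd() {
    printf 'sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-%s %s\n' "$1" "$2"
}

gcc_alternative_cmd 7 70
gcc_alternative_cmd 5 50
echo 'sudo update-alternatives --config gcc'
```

A higher priority wins in automatic mode; --config then lets you pick one interactively.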

Kill Processes

Refer to linux下杀死进程(kill)的N种方法 (N ways to kill a process on Linux)

ps -ef | grep R
kill -s 9 ...

where the output format of ps -ef is

$ ps -ef | head -2
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:15 ?        00:00:44 /sbin/init splash

The meaning of each column can be found in the STANDARD FORMAT SPECIFIERS section of man ps. Specifically,

  • UID: same as EUID, effective user ID (alias uid).
  • PID: a number representing the process ID (alias tgid).
  • PPID: parent process ID.
  • C: processor utilization. Currently, this is the integer value of the percent usage over the lifetime of the process. (see %cpu).
  • STIME: same as START, starting time or date of the process. Only the year will be displayed if the process was not started the same year ps was invoked, or “MmmDD” if it was not started the same day, or “HH:MM” otherwise. See also bsdstart, start, lstart, and stime.
  • TTY: controlling tty (terminal). (alias tt, tty).
  • TIME: cumulative CPU time, “[DD-]HH:MM:SS” format. (alias cputime).
  • CMD: see args. (alias args, command). When the arguments to that command cannot be located, the command name is shown in square brackets [].
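Once the PID is known from the PID column, it is usually gentler to send TERM first and keep kill -s 9 for processes that ignore it. A self-contained sketch using a throwaway sleep process as a stand-in for the real target:

```shell
sleep 300 &                 # stand-in for a long-running process
pid=$!
kill -s TERM "$pid"         # polite request; -s 9 (KILL) cannot be caught or cleaned up after
wait "$pid" 2>/dev/null     # reap it; death by signal is reported as 128 + signal number
status=$?
echo "exit status: $status" # 143 = 128 + 15 (SIGTERM)
```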

user vs. sys

control android phone by PC’s mouse and keyboard

How to Control Your Android Using Your Computer’s Mouse and Keyboard

Run in Background

Note that jobs only shows background programs that belong to the current shell; after logging in again they are no longer listed. See jobs command doesn’t show any background processes

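It seems that docker run under nohup exits immediately and produces the file nohup.txt, whose content is “the input device is not a TTY”. This message typically appears when docker run is given -t/-it while no terminal is attached.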
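Because jobs is tied to one shell, a simple workaround is to record the PID when the job is started and check it later from any session. A minimal sketch, with sleep as a placeholder command and a temporary file standing in for wherever you would keep the PID:

```shell
pid_file=$(mktemp)
nohup sleep 300 >/dev/null 2>&1 &     # placeholder for the real long-running command
echo $! > "$pid_file"

# Later, possibly from a different login shell where `jobs` prints nothing:
if kill -0 "$(cat "$pid_file")" 2>/dev/null; then alive=yes; else alive=no; fi
echo "still running: $alive"

# Clean up the demo process.
kill "$(cat "$pid_file")"
rm -f "$pid_file"
```

kill -0 sends no signal; it only tests whether the process exists and can be signalled. ps -p PID or pgrep give the same information with more detail.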

crontab scheduled tasks

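* */1 * * * * runs every minute. The first column is the minute and the second is the hour, so this usage reflected a misunderstanding of the syntax; changing it to * * */1 * * * still ran every minute. I tried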

sudo service cron restart
# or
sudo service cron reload

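but neither changed anything, so my understanding was still off: a * in the minute field matches every minute, so the entry fires every minute whatever the other fields say. For an hourly job the minute field has to be fixed, e.g. 0 * * * *.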

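Refer to Linux 设置定时任务crontab命令 (setting up scheduled tasks on Linux with the crontab command) and 关于定时执行任务:Crontab的20个例子 (on scheduled tasks: 20 crontab examples)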
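Why a * in the minute column dominates can be seen by counting how often an entry would fire in a day. The matcher below is a deliberately simplified sketch of cron's semantics, not cron's real parser: it understands only *, */N, and a plain number, and looks only at the minute and hour fields. Both function names are my own.

```shell
# Simplified matcher for one crontab field: supports *, */N and a plain number only.
cron_field_matches() {
    field=$1
    value=$2
    case $field in
        '*')   return 0 ;;
        '*/'*) step=${field#??}            # strip the leading "*/"
               [ $((value % step)) -eq 0 ] ;;
        *)     [ "$field" -eq "$value" ] ;;
    esac
}

# Count how many minutes of a day match the given minute and hour fields.
count_runs_per_day() {
    min_field=$1
    hour_field=$2
    n=0
    h=0
    while [ "$h" -lt 24 ]; do
        m=0
        while [ "$m" -lt 60 ]; do
            if cron_field_matches "$hour_field" "$h" && cron_field_matches "$min_field" "$m"; then
                n=$((n + 1))
            fi
            m=$((m + 1))
        done
        h=$((h + 1))
    done
    echo "$n"
}

count_runs_per_day '*' '*/1'   # prints 1440: every minute of every hour
count_runs_per_day 0 '*/1'     # prints 24: minute 0 of every hour
```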

gvim fullscreen

refer to Is there a way to turn gvim into fullscreen mode?

In short,

  1. install wmctrl
  2. map F11 via .vimrc
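
Delete all broken symbolic links in the current directory (subdirectories are pruned, not searched):

find -L . -name . -o -type d -prune -o -type l -exec rm {} +

refer to Delete all broken symbolic links with a line?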
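The find one-liner above can be tried safely in a scratch directory first. With -L, find follows links, so -type l is true only for links whose target is missing. The directory and file names below are made up for the demonstration.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/target"
ln -s target  "$demo_dir/good"      # valid link
ln -s missing "$demo_dir/broken"    # dangling link

# Run the command inside the scratch directory only.
( cd "$demo_dir" && find -L . -name . -o -type d -prune -o -type l -exec rm {} + )

remaining=$(ls "$demo_dir" | sort | tr '\n' ' ')
echo "$remaining"                   # prints: good target
rm -rf "$demo_dir"
```

Replacing -exec rm {} + with -print gives a dry run that only lists what would be deleted.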

hydrogen specify the conda environment

Just run

source activate thisenv
python -m ipykernel install --user --name thisenv

This is needed only once; Hydrogen will remember the kernel afterwards.

Refer to How to specify the conda environment in which hydrogen (jupyter) starts?

.netrc

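To learn RL, after watching 周博磊 (Bolei Zhou)'s videos on Bilibili, I wanted to play with the example code, but creating a new conda environment in the terminal,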

conda create --name RL python=3

always failed with

Collecting package metadata (current_repodata.json): failed

ProxyError: Conda cannot proceed due to an error in your proxy configuration. Check for typos and other configuration errors in any ‘.netrc’ file in your home directory, any environment variables ending in ‘_PROXY’, and any other system-wide proxy configuration settings.

The message mentions a .netrc file. To my surprise I actually had one, with only two entries,

machine api.heroku.com
...
machine git.heroku.com
...

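Only then did I realize it had been created long ago, without my noticing, while I was tinkering with heroku. So what is this file for? A quick search turned up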

This is a file that is often used by Unix programs to hold access details for remote sites. It was originally created for use with FTP.

In the end, the problem was solved by simply removing all proxy settings from .bashrc.
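Since the conda message points at environment variables ending in _PROXY, it is worth listing what the current shell still has set after editing .bashrc (an already open shell keeps the old values). A small sketch; list_proxy_vars is a hypothetical helper, and it prints names only so that credentials embedded in proxy URLs are not echoed.

```shell
# Print the names of environment variables ending in _proxy / _PROXY.
list_proxy_vars() {
    env | grep -iE '^[A-Za-z_]*_proxy=' | sed 's/=.*//'
}

# Usage: clear them in the current shell
#   for v in $(list_proxy_vars); do unset "$v"; done
```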

GPG error

$ sudo apt-get update
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease: The following signatures were invalid: EXPKEYSIG 51716619E084DAB9 Michael Rutter <marutter@gmail.com>
W: Failed to fetch https://cloud.r-project.org/bin/linux/ubuntu/bionic-cran35/InRelease  The following signatures were invalid: EXPKEYSIG 51716619E084DAB9 Michael Rutter <marutter@gmail.com>
W: Some index files failed to download. They have been ignored, or old ones used instead.

I found the expired key via

$ apt-key list
pub   rsa2048 2010-10-19 [SCA] [expired: 2020-10-16]
...
uid           [ expired] Michael Rutter <marutter@gmail.com>

but following How to solve an expired key (KEYEXPIRED) with apt did not seem to work:

$ apt-key adv --keyserver keys.gnupg.net --recv-keys 51716619E084DAB9
Executing: /tmp/apt-key-gpghome.CYSI3C6heK/gpg.1.sh --keyserver keys.gnupg.net --recv-keys 51716619E084DAB9
gpg: key 51716619E084DAB9: "Michael Rutter <marutter@gmail.com>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1

Then I tried another keyserver, mentioned in Installing R from CRAN Ubuntu repository: No Public Key Error

$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
[sudo] password for weiya:
Executing: /tmp/apt-key-gpghome.xUS3ZEg8N2/gpg.1.sh --keyserver keyserver.ubuntu.com --recv-keys 51716619E084DAB9
gpg: key 51716619E084DAB9: "Michael Rutter <marutter@gmail.com>" 2 new signatures
gpg: Total number processed: 1
gpg:         new signatures: 2

Now the new signatures have been fetched, and the key is no longer reported as expired.

Another one,

W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://download.opensuse.org/repositories/Emulators:/Wine:/Debian/xUbuntu_18.04 ./ InRelease: The following signatures were invalid: EXPKEYSIG DFA175A75104960E Emulators OBS Project <Emulators@build.opensuse.org>
W: Failed to fetch https://download.opensuse.org/repositories/Emulators:/Wine:/Debian/xUbuntu_18.04/./InRelease  The following signatures were invalid: EXPKEYSIG DFA175A75104960E Emulators OBS Project <Emulators@build.opensuse.org>
W: Some index files failed to download. They have been ignored, or old ones used instead.

According to the record in WeChat in Linux, this repository does not seem important, so for simplicity I just unticked it in the software settings.
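In both warnings the key ID that needs refreshing is the token right after EXPKEYSIG, so it can be pulled out of the apt output mechanically. A sketch, with expired_key_ids as a helper name of my own; it assumes the warning format shown above.

```shell
# Extract the unique key IDs that follow EXPKEYSIG in apt warnings read from stdin.
expired_key_ids() {
    grep -oE 'EXPKEYSIG [0-9A-F]+' | awk '{ print $2 }' | sort -u
}

# Usage: sudo apt-get update 2>&1 | expired_key_ids
```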

sftp via File Manager

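When using connect to server, a window often pops up asking for a username and password if the address has the form sftp://xxx.xxx.xx.xx. To avoid entering the password, try sftp://user@xxx.xxx.xx.xx instead. Sometimes, though, logging in to other servers works directly without specifying a username. I am not sure why; my guess is that the usernames on those servers happen to match the local one.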

You have new mail

Here is a message shown when I log in to the Office PC,

You have new mail.
Last login: Thu May 20 13:29:14 2021 from 127.0.0.1

Refer to “You have mail” – How to Read Mail in Linux Command Line, the message is stored in the spool file, which is located at /var/mail/$(whoami). Then I found that it was a failed email from a mail notification script I had written to report new error messages in /var/log/apache2/error.log.

systemd

systemd is a system and session manager for Linux, compatible with SysV and LSB init scripts. systemd

  • provides aggressive parallelization capabilities,
  • uses socket and D-Bus activation for starting services,
  • offers on-demand starting of daemons,
  • keeps track of processes using Linux cgroups
  • supports snapshotting and restoring of the system state
  • maintains mount and automount points
  • implements an elaborate transactional dependency-based service control logic.

control systemd once booted

The main command used to control systemd is systemctl.

  • systemctl list-units: list all units
  • systemctl start/stop [NAME]: start/stop (activate/deactivate) one or more units
  • systemctl enable/disable [NAME]: enable/disable one or more unit files
  • systemctl reboot: shut down and reboot the system

Warning

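WSL did not support systemd when this note was written (newer WSL 2 releases can enable it with systemd=true under [boot] in /etc/wsl.conf); without it, systemctl throws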

$ systemctl list-units
System has not been booted with systemd as init system (PID 1). Can't operate.

See also: systemd/User - ArchWiki

locally via --user

A service can also be set up for a single user, such as ~/.local/share/systemd/user/ssh4lab.service in 🔗

we can check the status and manage it via

systemctl --user status/stop/start/disable/enable ssh4lab
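For readers who have not written a user unit before, here is a minimal sketch of what such a file can look like. It is not the ssh4lab unit: the name demo-sleep.service, the helper write_demo_unit, and the ExecStart command are all made up for illustration.

```shell
# Write a minimal, hypothetical user unit into the directory given as $1.
write_demo_unit() {
    unit_dir=$1
    mkdir -p "$unit_dir"
    cat > "$unit_dir/demo-sleep.service" <<'EOF'
[Unit]
Description=Hypothetical demo service (not the ssh4lab unit)

[Service]
ExecStart=/bin/sleep infinity
Restart=on-failure

[Install]
WantedBy=default.target
EOF
}

# Usage (not run here):
#   write_demo_unit ~/.config/systemd/user
#   systemctl --user daemon-reload
#   systemctl --user enable --now demo-sleep
```

~/.config/systemd/user is another of the per-user unit directories, alongside the ~/.local/share/systemd/user path mentioned above.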

see also: 🔗

System Monitor

I am currently using the gnome-shell extension and Netdata: Web-based Real-time performance monitoring

Info

Other candidates:

  • ram_available: percentage of estimated amount of RAM available for userspace processes, without causing swapping
  • ram_in_use: system memory utilization
  • 30min_ram_swapped_out: percentage of the system RAM swapped in the last 30 minutes (???)
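  • system.load: system load averages, which show the demand that running threads (tasks) place on the system as an average number of running plus waiting threads. Linux load averages measure the demand of tasks on the system, which can exceed what the system is currently processing; most tools show three averages, for 1, 5, and 15 minutes (see Linux Load Averages:什么是平均负载? - 知乎).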
    • load_average_1: system one-minute load average
    • load_average_5: system five-minute load average
    • load_average_15: system fifteen-minute load average
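    • if the averages are 0.0, the system is idle
    • if the 1min average is higher than the 5min or 15min average, the load is increasing
    • if the 1min average is lower than the 5min or 15min average, the load is decreasing
    • if they are higher than the number of CPUs, the system is likely to run into performance problems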
  • python.d_job_last_collected_secs: number of seconds since the last successful data collection
  • system.swap
    • used_swap: swap memory utilization
  • system.cpu
    • 10min_cpu_usage: average CPU utilization over the last 10 minutes (excluding iowait, nice and steal)
    • 10min_cpu_iowait: average CPU iowait time over the last 10 minutes
  • ipv4.udperrors
    • 1m_ipv4_udp_receive_buffer_errors: average number of UDP receive buffer errors over the last minute
  • disk_space_usage: disk / space utilization
  • linux_power_supply_capacity: percentage of remaining power supply capacity
  • 10s_ipv4_tcp_resets_received: average number of received TCP RESETS over the last 10 seconds. This can be an indication that a service this host needs has crashed. Netdata will not send a clear notification for this alarm.
  • net.enp2s0
    • 1m_received_traffic_overflow: average inbound utilization for the network interface enp2s0 over the last minute. Check /var/log/secure for attempts to attack the server; refer to 详解CentOS通过日志反查入侵 (a detailed guide to tracing intrusions through CentOS logs)
  • net_fifo.enp2s0.10min_fifo_errors: number of FIFO errors for the network interface enp2s0 in the last 10 minutes; possibly an indicator of heavy overflow to disk
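The rule of thumb for system.load in the list above (load higher than the number of CPUs suggests trouble) is easy to check by hand without Netdata. A minimal sketch; load_exceeds_cpus is a hypothetical helper, and the live check assumes a Linux system with /proc/loadavg and nproc.

```shell
# Succeeds if the load average in $1 is greater than the CPU count in $2.
# awk is used because the shell's own arithmetic is integer-only.
load_exceeds_cpus() {
    awk -v avg="$1" -v n="$2" 'BEGIN { exit !(avg + 0 > n + 0) }'
}

# Live check against the 1-minute load average, where available.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    if load_exceeds_cpus "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"; then
        echo "1-minute load is above the CPU count"
    fi
fi
```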

Get history of other tty/pts?

It seems not.

see also How to get complete history from different tty or pts - Stack Overflow